MASTER THESIS
Term paper submitted in partial fulfillment of the requirements
for the degree of Master of Science in Engineering at the
University of Applied Sciences Technikum Wien - Degree
Program Information Systems Management
Opportunity detection and trade simulation
system for arbitrage trading on the crypto
market
By: Lukas Fankhauser, BA
Student Number: 2110302055
Supervisor 1: Mag. Robert Jonas
Supervisor 2: Sophie Weidenhiller, BA
Vienna, 28/05/2023
Declaration of Authenticity
“As author and creator of this work to hand, I confirm with my signature knowledge of the
relevant copyright regulations governed by higher education acts (see Urheberrechtsgesetz/
Austrian copyright law as amended as well as the Statute on Studies Act Provisions /
Examination Regulations of the UAS Technikum Wien as amended).
I hereby declare that I completed the present work independently and that any ideas,
whether written by others or by myself, have been fully sourced and referenced. I am aware
of any consequences I may face on the part of the degree program director if there should be
evidence of missing autonomy and independence or evidence of any intent to fraudulently
achieve a pass mark for this work (see Statute on Studies Act Provisions / Examination
Regulations of the UAS Technikum Wien as amended).
I further declare that up to this date I have not published the work to hand nor have I
presented it to another examination board in the same or similar form. I affirm that the
version submitted matches the version in the upload tool.”
Vienna, 28/05/2023
Location, Date
Signature
Kurzfassung
This thesis examines the potential of arbitrage trading on the crypto market across a
broader spectrum than before. It considers 48 different cryptocurrencies from the 100
largest by market capitalization, which are traded against the euro on 16 crypto exchanges.
A systematic approach, based on a comprehensive analysis of historical data from the
last 16 months, is applied to select the assets with the highest potential for arbitrage.
These findings are extended by the development of a prototype for
arbitrage trading using centralized crypto exchanges. The identified
trading opportunities are determined on the basis of real-time exchange data, and
trading is simulated with market orders as paper trading.
The results show that the crypto market is characterised by recurring episodes of
opening and closing arbitrage opportunities, which can last up to
several days. In addition, the market exhibits characteristics that favour the
occurrence of arbitrage opportunities. In summary, it is derived from the
results that arbitrage trading can be regarded as a profitable strategy on the crypto
market.
This thesis extends existing studies on arbitrage trading in the cryptocurrency market
by broadening the scope of the assets and exchanges considered. The results
are consistent with previous related studies and offer valuable insights for traders
and researchers by highlighting the potential of arbitrage strategies and the importance of
automated trading solutions.
Keywords: Arbitrage Trading, Cryptocurrency Market, Automated Trading, Trading
Strategies, Development Prototype, Historical Price Data
Abstract
This thesis examines the potential of arbitrage trading in the cryptocurrency market across a
broader spectrum than previous work, considering 48 diverse crypto assets from the top 100 by
market cap, traded against the euro on 16 exchanges. To provide a systematic approach for
selecting assets with the highest potential for arbitrage opportunities, a comprehensive
analysis of historical data over the last 16 months is conducted. Furthermore, these findings are
extended with the development of an arbitrage trading prototype utilising centralized crypto
exchanges. The performance of the identified opportunities is evaluated using real-time
exchange data, and trades are simulated with market orders via paper trading.
The results show that the crypto market is characterised by recurring episodes of opening
and closing arbitrage opportunities, which can last up to several days. In addition, the crypto
market shows characteristics that should encourage arbitrage opportunities to occur.
Overall, the results indicate that arbitrage trading can be considered a profitable strategy
in the cryptocurrency market.
This thesis contributes to the existing body of knowledge on arbitrage trading in the cryptocurrency
market by expanding the scope of assets and exchanges considered. The findings are
consistent with previous related studies and offer valuable insights for traders and
researchers, highlighting the potential of arbitrage strategies and the significance of
automated trading solutions.
Keywords: Arbitrage Trading, Crypto market, Automated Trading, Trading Strategies,
Development Prototype, Historical Pricing data
Acknowledgement
I want to express my appreciation and gratitude to several people, without whom this
thesis would not have been finished.
First of all, I would like to thank my supervisor at the FH Technikum Wien, Mag. Robert
Jonas, for his valuable input and clear feedback to bring this thesis to its final stage.
Moreover, his great responsiveness and the quick possibilities of setting up meetings, when
needed, are worth emphasising.
Second, I would like to thank my company supervisor at Autowhale GmbH, Sophie
Weidenhiller, for her patience and valuable help with questions in the field of the crypto
market. In addition, I am grateful that I was able to write this thesis at this company
as an external student, which allowed me to significantly increase my
knowledge and interest in this area.
Finally, words cannot express my gratitude to my family and friends for their support,
motivation and keeping my spirits high throughout the process of writing this thesis.
Table of Contents
1 Introduction
1.1 Research Subject
1.2 Structure
2 Current State of Literature and Technology
2.1 Literature Review
2.1.1 Search Process
2.1.2 Literature Summary
2.1.3 Literature Analysis
2.1.4 Conclusion
2.2 Arbitrage Trading
2.2.1 Categorization and importance in trading strategies
2.2.2 History of arbitrage trading
2.2.3 Types of arbitrage trading
2.3 Crypto market
2.3.1 Types of exchanges
2.3.2 Fees
2.3.3 Price formation
2.3.4 Order Books and corresponding definitions
2.3.5 Theoretical concepts for Arbitrage Trading
3 Methods
3.1 Data collection & management
3.2 Concept for crypto asset filtering
3.3 Arbitrage trading prototype
4 Data collection & management
4.1 Data sources
4.2 Assets and Exchanges
4.3 Collection of data
4.4 Data pre-processing
5 Crypto asset filtering
5.1 Data cleaning & adjustments
5.2 Measurements
5.2.1 Arbitrage Index
5.2.2 Price differences
6 Arbitrage trading prototype
6.1 Programming Language
6.2 Prototype Architecture
6.3 Finding Arbitrage Opportunities
6.4 Exchange Connections
6.5 Dictionaries
6.6 Testing
7 Results
7.1 Data collection & management
7.2 Crypto asset filtering
7.2.1 Results for Arbitrage Index
7.2.2 Results for Price Differences
7.3 Arbitrage trading prototype
8 Discussion
8.1 Interpretation of results
8.2 Limitations
8.3 Outlook & Future work
9 Conclusion
1 Introduction
The beginning of today’s crypto market was marked by Satoshi Nakamoto's white paper
"Bitcoin: A peer-to-peer Electronic Cash System", which was published in October 2008
(Nakamoto, 2008). The first software for it was released in January 2009. Bitcoin was the first
widely adopted mechanism to provide absolute scarcity of a money supply, which used
cryptography to control the distribution and creation without the need for centralized
authorities like banks or governments (Böhme et al., 2015). After the appearance of Bitcoin,
more and more cryptocurrencies and exchanges appeared over the years, leading to a
market of thousands of assets and hundreds of exchanges with a market capitalization of
over 1 trillion dollars as of April 2023 (“Total Cryptocurrency Market Cap,” 2023).
Trading strategies are the basis on which amateur and professional traders generate profits in
asset markets such as the crypto or stock market. Some of these strategies, which are often
automated, are also beneficial to the market. Arbitrage trading, for example, improves
price stability and reduces price differences. The idea is to profit from market
inefficiencies (Heckel and Waldenberger, 2022) by simultaneously selling an asset on a
higher-priced exchange and buying it on a lower-priced one. However, not every trading
strategy of traditional financial markets can be applied in the crypto market.
In theory, hypotheses that assume economic equilibrium, such as the Efficient-Market Hypothesis
or the Law of One Price, which will be explained in the following chapters, should not allow
market inefficiencies, and therefore arbitrage opportunities, to appear. Still, several papers
have shown that these do exist, also over longer periods and with greater price differences across
geographical regions (Brauneis and Mestel, 2018; Duan et al., 2021; Makarov and Schoar,
2020). In addition, crypto markets show characteristics which, in theory, should encourage
arbitrage opportunities to occur, such as fast-moving markets, low regulation, high
accessibility, high number of speculators and often inefficient markets (Al-Yahyaee et al.,
2020; Duan et al., 2021; Dwyer, 2015; Levus et al., 2021; Makarov and Schoar, 2020).
Therefore, the motivation for this thesis is to address these opportunities in more detail for
multiple crypto exchanges and assets.
Regarding the research gap, the crypto market is still far less researched academically than
other financial markets, although research increased after the market spike in 2017,
when the market generated a lot of public attention. To date, only a limited number of
academic studies have addressed the topic of arbitrage trading for the crypto market, with
the majority focusing on just a few cryptocurrencies, mostly Bitcoin and Ethereum.
Additionally, most papers focus on the general arbitrage conditions for this market and on
trading pairs that are traded against the US dollar. Very few papers considered diverse
assets and diverse exchanges, and none known to the author did so for Euro (EUR) pairs. In the scope of this
thesis, research will be expanded to include a broader range of crypto assets and
exchanges, to provide valuable insights into the efficiency and opportunities for arbitrage
trading within the cryptocurrency market. Additionally, a software prototype for an arbitrage
trading system will be implemented.
1.1 Research Subject
In this thesis, four research questions (RQ) were formulated. One of them is the main
research question and the other three pose one question per method.
Nr. | Research Question | Methods
1 | How are traditional trading strategies of financial markets, such as arbitrage trading, also applicable with assets on the crypto market in an automated way? | Data collection & management, Crypto asset filtering, Arbitrage trading prototype
1.1 | Which information must be gathered to enable decision making for arbitrage trading opportunities? | Data collection & management
1.2 | What are the requirements and criteria, for a crypto asset to be considered for arbitrage-trading systems? | Crypto asset filtering
1.3 | How can an arbitrage trading strategy in the crypto market be realized as a software prototype? | Arbitrage trading prototype
Table 1: Research Questions and used methods.
The expected results for the main research question are to demonstrate the existence and
exploitability of arbitrage opportunities in the crypto market. This will be shown by a concept
for finding suitable assets and exchanges based on historical pricing data. Subsequently,
these findings are validated, by implementing a development prototype for arbitrage trading,
which evaluates arbitrage opportunities with live data of exchanges. The research aims to
provide evidence that arbitrage trading can be a potentially profitable strategy also in the
crypto market.
For RQ 1.1, the expected results are to discover information and requirements needed to
determine the suitability of assets and exchanges for arbitrage trading. This is demonstrated
by the provision of historical price data of the last 16 months in the correct format and
structure, which are required for further data-processing and for enabling decisions to be
made.
For RQ 1.2, the expected outcomes are to prove the existence of arbitrage opportunities with
a concept to evaluate suitable crypto assets and exchanges. This is shown by conducting an
analysis with two chosen mathematical measurements, based on historical pricing data. The
research is additionally expected to conclude that the discovered arbitrage opportunities can
also be found by an arbitrage trading prototype using live exchange data.
For RQ 1.3, the expected results are to show that arbitrage opportunities can be found not only
in historical pricing data, but also live between crypto exchanges. This is shown
by identifying and utilising suitable technologies and developing a functional software
prototype that is capable of connecting to real-time data feeds of crypto exchanges,
discovering price differences for an asset and simulating trades based on them.
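To illustrate what simulating a trade with market orders can look like, the following is a minimal sketch and not the prototype developed in this thesis. It assumes that each exchange provides an order book snapshot as a list of (price, size) levels sorted best first; the function name fill_market_order, the variable names and all numbers are hypothetical, and fees are ignored.

```python
def fill_market_order(levels, quantity):
    """Walk (price, size) levels, best first; return (filled_quantity, average_price)."""
    remaining = quantity
    spent = 0.0
    for price, size in levels:
        take = min(remaining, size)      # consume as much of this level as needed
        spent += take * price
        remaining -= take
        if remaining <= 0:
            break
    filled = quantity - remaining        # may be less than requested if the book is thin
    return filled, (spent / filled if filled else 0.0)


# Illustrative snapshots, not real market data.
asks_cheap_exchange = [(100.0, 1.0), (100.5, 2.0)]   # the asset is bought here
bids_rich_exchange = [(102.0, 1.5), (101.0, 3.0)]    # the asset is sold here

bought, avg_buy = fill_market_order(asks_cheap_exchange, 2.0)
sold, avg_sell = fill_market_order(bids_rich_exchange, 2.0)
paper_profit = sold * avg_sell - bought * avg_buy    # gross result of the paper trade
```

Because a market order consumes several price levels, the average execution price is usually worse than the best quoted price, which is why the sketch averages over the levels that were actually filled.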
1.2 Structure
This thesis consists of nine chapters. The first chapter gives a general introduction to the topic
and presents the research gap, motivation and research subject. In the second
chapter, a comprehensive literature review is conducted, explaining the relevant basics
and interrelations of arbitrage trading and the crypto market. Subsequently, the scientific
methods used are explained, followed by the implementation of each of the three methods in
separate chapters. Next, the results of the methods are presented. Thereafter, these are
discussed, limitations are presented, and an outlook on future work is given. Finally,
conclusions are drawn on the outcomes and on the research subject, the applicability of
arbitrage trading in the crypto market.
2 Current State of Literature and Technology
2.1 Literature Review
Arbitrage is not a novel idea and has been around for a long time. Banks and other financial
institutions all over the world have been utilizing this technique in the stock, forex, commodity
and other markets for decades. The intention in this thesis is to review and implement the
application of arbitrage trading in the crypto market in a similar manner, building on findings
of existing literature. The cryptocurrency market presents itself as no exception to this
concept, as the divergences between prices are often greater compared to traditional
markets. This seems to be due to the large number of cryptocurrencies and exchanges, low
government regulation, decentralization, high degree of volatility and a high number of
speculators, which makes it difficult to achieve a consistent price and therefore provides a
good basis for arbitrage (Levus et al., 2021).
2.1.1 Search Process
The search process is a key element of a literature review; several inclusion and
exclusion factors were therefore considered.
Inclusion factors were used primarily to narrow the search, such as:
- Language - In order to avoid ambiguity between documents in different languages,
the documents included in the search must be in English.
- Accessibility - Articles must be freely available via Open Access or must be
accessible via an institutional access system for universities.
- Keywords - The keywords listed in Table 2 must be included in the document.
Exclusion factors were used to refine the search and obtain accurate results, such as:
- Literature type - Only peer-reviewed, published papers in journals and books were
considered for this review.
- Abstract content - The abstract of a paper must relate to arbitrage trading and should
relate to the cryptocurrency market.
- Misleading terms - Some terms are strongly correlated with the topic of this thesis,
but due to the thematic divergence, these have been excluded.
Search engine used | Keywords used | Number of results
search.onb.ac.at (Österr. Nationalbibliotheken) | Arbitrage Trading, Arbitrage Handel | 54, 37
buechereien.wien.gv.at | Arbitrage Trading | 2
ProQuest Ebook Central | Arbitrage Trading, Arbitrage Crypto | 98, 8
base-search.net | Arbitrage Trading, Arbitrage Handel, Arbitrage Crypto | 9603, 49, 55
sciencedirect.com | Arbitrage Trading, Arbitrage Crypto, Arbitrage Cryptomarket, Arbitrage Bitcoin, Arbitrage Ethereum | 20095, 168, 10, 383, 180
econbiz.de | Arbitrage Trading, Arbitrage Handel, Arbitrage Crypto, Arbitrage DeFi, Arbitrage Bitcoin | 8205, 1064, 13, 153, 81
Table 2: Keyword search results
Arbitrage trading has been used for a long time, so a large body of literature exists. Books
on trading strategies are available in abundance; however, when narrowed down to
crypto assets, their number is considerably reduced. Therefore, mainly peer-reviewed
papers published in journals were found. Since the world of trading and the crypto market is
not dominated by the German market, almost all the resources found were in English.
As a consequence, the search was limited to English resources only.
2.1.1.1 Clarification of Terms and scope
In the first phase of the literature review, a lot of literature was found that is strongly
related to arbitrage trading, but thematically has a different meaning or is not in the scope
of this thesis. For this reason, a few resources had to be excluded from the search based on
certain terms.
These were for instance:
- Statistical Arbitrage: Similar in name, but statistical arbitrage denotes
pairs trading, which is not the trading strategy covered in this thesis. It had to be
excluded because it is often simply called arbitrage trading.
- Impact on the market: As arbitrage opportunities exist in inefficient markets, there is a
lot of literature on how arbitrage impacts the market and how to eliminate these
opportunities or close the gap until arbitrage opportunities disappear.
2.1.2 Literature Summary
Searching for the exact title of this work on various search engines yields no results for
other works. This presumably results from the fact that the title is defined quite specifically.
When the search was simplified to terms related to arbitrage or cryptocurrencies, several
papers could be found. Additionally, it was observed that the topic of arbitrage trading is
often discussed together with the topic of market efficiency, which is plausible, since
opportunities for arbitrage trading arise from inefficiencies in markets.
Most search results were returned by publications that addressed the topic of arbitrage
trading in general. From these, 29 papers came into closer consideration. Out of those, four
address arbitrage theory of the non-crypto market (Angerer et al., 2023; Fernández-Pérez et
al., 2012; Heckel and Waldenberger, 2022; Kiuchi, 2022). Two papers discussed the
limitations of the theory of arbitrage in the non-crypto market (Gromb and Vayanos, 2018,
2002). Six explore High-Frequency-Trading, of which arbitrage trading can be a type when it
is automated (Brogaard et al., 2014; Brogaard and Garriott, 2019; Budish et al., 2015;
Carrion, 2013; Kiuchi, 2022; O’Hara, 2015).
Four papers are related to the crypto market and addressed the theoretical concepts of
trading systems and specifically also arbitrage trading (Bruzgė and Šapkauskienė, 2022;
Kabašinskas and Šutienė, 2021; Makarov and Schoar, 2020; Mohan, 2022). Three
publications describe this topic further with technical concepts (Kakushadze and Yu, 2019;
Levus et al., 2021; Pauna, 2018). Since arbitrage opportunities exist through market
inefficiencies, most literature about arbitrage trading in the crypto market in fact deals with
the subject of market efficiency, where ten papers were considered (Berg et al., 2022;
Brauneis and Mestel, 2018; Clements, 2021; Duan et al., 2021; Holste and Gallus, 2019;
Krückeberg and Scholz, 2020; Lee et al., 2020; Saengchote, 2021; Urquhart, 2016; Zhang et
al., 2018).
2.1.3 Literature Analysis
When analysing the literature, it becomes apparent that certain topics were addressed more
often and in more detail. Research on crypto assets in finance and economics is still in its
infancy compared to traditional markets. Most studies in this area focus on the practical
implications of using cryptocurrencies as a form of payment and conducting transactions.
The first serious research was done on the economical dynamics, theory and price formation
of bitcoin (Ciaian et al., 2016; Dwyer, 2015).
Literature about arbitrage trading has existed since long before the 2000s. After the
introduction of Bitcoin in 2008, the first academic papers about arbitrage trading in the
crypto market, and therefore especially for bitcoin, appeared in 2012. The first literature on
efficiency appeared in 2016 and gained significantly more traction after the first spike of the
crypto market in 2017. Literature about practical approaches to automated arbitrage trading
systems in the crypto market has appeared since 2018 (Kakushadze and Yu, 2019; Levus et al.,
2021; Makarov and Schoar, 2020; Pauna, 2018).
Using a long-memory method, a study (Duan et al., 2021) analyses the development of
informational efficiency and its effect on cross-market arbitrage opportunities. The findings
indicate that all five of the largest crypto markets studied were nearly fully informationally
efficient over the sample period, however, the level of market efficiency varied among
markets and over time. A further study (Al-Yahyaee et al., 2020) shows that the top ranked
cryptocurrencies are not efficient, which aligns with other previously published findings and
concludes that the inefficiency of crypto markets is subject to change over time. Additionally,
the study examines the multifractality, long-memory process, and efficiency hypothesis of six
major cryptocurrencies (Bitcoin, Ethereum, Monero, Dash, Litecoin, and Ripple). Two other
papers support that by discussing the limitations of the theory of arbitrage, which suggests
that prices may not always align with the law of one price, even when arbitrageurs are
present (Gromb and Vayanos, 2018, 2002). In addition, one paper found that cryptocurrency
efficiency increases with liquidity (Brauneis and Mestel, 2018). Regarding risk, general
investments in the crypto market are considered risky, whereas arbitrage trading is considered
low-risk or even risk-free (Bruzgė and Šapkauskienė, 2022; Makarov and Schoar, 2020; Mohan,
2022), bearing in mind that all trades are subject to a basic execution risk (Krückeberg and
Scholz, 2020).
There are several factors from which the success of automated arbitrage systems can be
derived. Multiple studies show that the success of such systems is determined by how
quickly they can search and transmit information, and therefore recognize arbitrage
opportunities and execute trades, specifically in comparison to the speed or latency of other
traders (Brogaard et al., 2014; Brogaard and Garriott, 2019; Budish et al., 2015; Carrion,
2013; Kiuchi, 2022; O’Hara, 2015).
2.1.4 Conclusion
Arbitrage trading is a well-tested trading strategy which, in addition to generating profits, also
benefits market efficiency by increasing price stability and reducing price differences. A
number of studies have already been carried out for the crypto market, but research is still at
an early stage. It has to be taken into consideration that markets cannot be directly
influenced, due to decentralization. A paper from 2020 shows that the top-ranked
cryptocurrencies are not efficient and that this inefficiency is subject to change over time
(Al-Yahyaee et al., 2020).
Even when markets are nearly informationally efficient, efficiency still varies (Duan et al.,
2021). Furthermore, it can be assumed that markets become more efficient when liquidity
increases (Brauneis and Mestel, 2018). Therefore, due to these inefficiencies, it can be
concluded that a suitable basis for finding arbitrage opportunities exists. Further contributing
factors are the large number of crypto assets and exchanges, low government
regulation or influence, a high degree of volatility and a high number of speculators (Levus et
al., 2021). Moreover, it can be concluded that speed and latency in comparison with other
traders play a major role in the success of arbitrage trading in the crypto market (Brogaard et al.,
2014; Brogaard and Garriott, 2019; Budish et al., 2015; Carrion, 2013; Kiuchi, 2022; O’Hara,
2015).
2.2 Arbitrage Trading
Arbitrage trading has long been established in traditional financial markets and its application
to the crypto market presents unique opportunities and challenges. The following chapter
gives an overview of the categorization in trading strategies, history and the different types
available.
2.2.1 Categorization and importance in trading strategies
Algorithmic trading has established itself in the markets and is used by individual and
institutional market participants. To give an overview of different trading strategies and to
be able to classify arbitrage trading, trading algorithms can be categorized into six types by
their objective and methods (Kiuchi, 2022):
1. Execution algorithms: These automate the allocation and timing of buy and sell
orders, select optimal markets and make adjustments to achieve goals such as cost
reduction. Some of these algorithms hide trade execution from other investors,
reducing the cost of market influence, others ensure compliance with market rules.
Often large orders are split into smaller ones and placed in stages to reduce market
impact. An important task of execution algorithms is therefore to determine and
implement optimal timing to minimize the sum of these two costs.
2. Benchmark execution algorithms: These are used, especially when executing
large orders, to ensure that the average price of each small order resulting from it
matches a benchmark such as the market closing price in order to limit the cost of
market manipulation.
3. Market-making algorithms: Algorithmic market makers place buy orders below and sell
orders above the current market price and try to profit from the
difference between the market price and the bid or ask price. Their main goal is to
provide market liquidity and stability, while benefiting from it.
4. Arbitrage Algorithms: Arbitrage algorithms utilize occurring price differences of
identical assets by simultaneously selling at the higher price and buying at the lower
price. In this way, they try to make profits and limit the risk of price changes. They
help to eliminate distortions in the markets and thus increase market efficiency. This
strategy will be examined in detail in this thesis for the crypto market.
5. Directional Algorithms: These algorithms use market data such as prices, trading
volumes and news to predict market price changes and profit from unidirectional
changes in market prices. This trading strategy is generally high risk but also high
return.
6. Market Manipulation Algorithms: Market manipulation algorithms are used to
influence market prices in their favour by providing false information about liquidity
and intent, thereby misleading other market participants. They can lead to lower
trading costs and profits, but also to delays or prevention of orders from other market
participants. These algorithms can enable users to make significant profits, but are
ethically troublesome, have a negative impact on market efficiency and can have a
greater impact especially on smaller cryptocurrencies.
It can be concluded that arbitrage trading is an important trading strategy that profits from
pricing inefficiencies in the market. At the same time, it is one of the strategies with a
positive effect on the market, as it is a driver of market efficiency.
2.2.2 History of arbitrage trading
The first mention of the concept of arbitrage trading is found in the Hammurabi Code about
1760 BC, which dealt extensively with trade and financial matters across geographical
regions. Arbitrage trading of coins and bars across different geographical regions was
common, but the minting of coins by different political jurisdictions and the lack of a
standardized unit of account made trades difficult. The introduction of a standardized
currency expanded the opportunities to allow geographical arbitrage of physical coins to take
advantage of different exchange rates. Opportunities for arbitrage arose from the trading
activities of networks of traders and money changers and included uncovered interest
arbitrage between areas with low interest rates and those with high rates. However, there
were challenges such as lack of liquidity, difficulties in obtaining information and transporting
goods over distances, and inherent political and economic risks (Poitras, 2010).
In the 16th century, exchanges like the one in Antwerp replaced medieval fairs as important
international venues for exchange trading. Medieval bankers operated arbitrage exchanges
to profit from discrepancies in exchange rates, and by the 18th century the exchange market
had developed in financial centres such as Amsterdam, London, Hamburg, and Paris. During
the 18th century, the depth and breadth of exchanges expanded significantly, which led to the
development of different types of tradable securities and commodities. The
increased speed of communication between major financial centres, driven by the
introduction of the telegraph and the ticker, made it easier to trade between exchanges and
to engage in geographical arbitrage. In the 19th century, records of trading in options were
added, in which, for example, short positions in Constantinople were combined with a written
put and a bought call in London (Poitras, 2010).
After trading in assets such as shares and bonds, the emergence of Bitcoin has now been
followed by trading in electronic currencies, opening new opportunities for arbitrage, which
can be completely automated.
2.2.3 Types of arbitrage trading
Arbitrage is the execution of a specific sequence of actions that begin and end with the same
asset and whose completion results in an increased value at the end of the sequence (Levus
et al., 2021). For example, if an asset A is bought for 20 on stock exchange X and sold for
20.50 on stock exchange Y, a profit of 0.50 will be made, not including transaction costs.
Like the financial markets themselves, arbitrage trading has evolved into an overarching term
with several sub-concepts and modes of operation. The trading concept is not new, and
different types of price spreads exist in every financial market, including stock exchanges
and currency markets. Banks and other financial institutions around the world have been
using this mechanism for hundreds of years to exploit price discrepancies and bring
efficiency to the markets (Levus et al., 2021).
The crypto market is no exception to arbitrage and works almost entirely on the same
principles as in traditional markets, but with different assets. Today there are about 4000
cryptocurrencies and multiple exchanges throughout every region in the world, with Binance,
Kraken, Coinbase, Bitfinex and Bittrex among the most popular. Arbitrage in the crypto
market is particularly attractive because low levels of state regulation, decentralization,
different markets, large numbers of speculators looking to make money and high volatility
create difficult conditions for a single price to exist (Krückeberg and Scholz, 2020). Large
price divergences between exchanges also often occur against a backdrop of political
instability. Thus, in August 2019, in Argentina, against the backdrop of a sharp fall in the
national currency, 1 bitcoin was priced 4% higher in US-Dollars than on international
platforms. However, price differences between exchanges are more common than they
seem, even when extreme economic and political events are excluded (Levus et al., 2021).
A further financial definition needed for all types of arbitrage trading is the concept of a
currency pair. This is a pair consisting of two currencies that are traded against each other.
The first currency is usually referred to as the base currency and the second as the quote currency.
For example, you need to find the BTC/EUR pair if you want to buy Bitcoin (BTC) for EUR.
Besides cryptocurrency/fiat currency pairs, there are also cryptocurrency/cryptocurrency
pairs, which are often more popular than fiat currency pairs due to trades between
cryptocurrencies (Levus et al., 2021).
The economic equilibrium mentioned above occurs when there are no more arbitrage
opportunities left, since the arbitrage-free condition essentially implies that there is no price
mismatch between the two markets. Arbitrage pushes markets towards this
equilibrium for two reasons. First, it is attractive because it offers low-risk profits, so one can be sure that
agents will rush to arbitrage opportunities when they arise. Second, the process of arbitrage
conveniently removes these opportunities at some point and markets will move towards
equilibrium where prices are equal, to the extent permitted by transaction costs (Mohan,
2022).
2.2.3.1 Pure arbitrage / Two-point arbitrage
The opportunity for two-point arbitrage appears due to a difference in prices across different
exchanges for the same asset at time $t$. It refers to the fact that an asset that is bought in one
market can be sold simultaneously in the other market in order to realise a profit without any
risk in theory. If there is a mismatch between the prices quoted in two exchanges, it is
profitable to do so (subject to transaction costs) (Mohan, 2022). To make this method
possible, a trader must hold a positive balance of the corresponding assets or fiat money on
the respective exchanges at the time of a trade. Obviously, the arbitrageur's balance of
assets would fall on the higher priced exchange, where the assets would be sold, and rise on
the lower priced exchange, since this is where purchases take place. To replenish this,
transferring assets or capital from the exchange with the high balance onto the exchange
with the low balance and vice versa is required. Ideally, the arbitrageur should be able to
transfer profits immediately from the expensive to the cheap markets and then repeat the
arbitrage until prices converge. The faster the arbitrageur can recycle capital from one
account to another, the more effective the arbitrage (Makarov and Schoar, 2020). Because
arbitrage eliminates the very price distortions it exploits, speed is crucial: the first
person to seize an arbitrage opportunity can make the greatest profit (Kiuchi, 2022).
In two-point arbitrage, traders use market orders because they calculate with a specific price
(Foucault et al., 2005). With a limit order, it is uncertain when, or whether, the order is
matched, which increases the risk of price changes while the order is being processed.
[Figure 1 labels: Exchange 1, Asset A, higher price; Exchange 2, Asset A, lower price; Buy; Sell; Price difference = Profit]
Figure 1: Visualization of a two-point arbitrage trading process
As already mentioned, if a profitable arbitrage opportunity is found, the arbitrage trade $i$
with the actions of buying and selling is done simultaneously with the same volume. On the
exchange with the lower price, a buy and on the other one with the higher price, a sell is
executed. For each trade $i$, the profit $P_i$ is equal to the difference between the sell
price $\mathit{sell}_i$ and the buy price $\mathit{buy}_i$, multiplied by the volume $V_i$
traded. The formula to calculate the profit is (Pauna, 2018):
$$P_i = V_i \, (\mathit{sell}_i - \mathit{buy}_i)$$
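The profit formula $P_i = V_i (\mathit{sell}_i - \mathit{buy}_i)$ can be illustrated with a minimal Python sketch. The function names and the proportional `fee_rate` parameter are assumptions made for illustration only and are not taken from Pauna (2018); the numbers reuse the earlier example of buying at 20 and selling at 20.50.

```python
def two_point_profit(volume: float, sell_price: float, buy_price: float) -> float:
    """Gross profit P_i = V_i * (sell_i - buy_i) of one two-point arbitrage trade."""
    return volume * (sell_price - buy_price)


def net_two_point_profit(volume: float, sell_price: float, buy_price: float,
                         fee_rate: float = 0.0) -> float:
    """Profit after a proportional fee on both legs (fee_rate is a hypothetical parameter)."""
    proceeds = volume * sell_price * (1 - fee_rate)  # what the sell leg pays out
    cost = volume * buy_price * (1 + fee_rate)       # what the buy leg costs
    return proceeds - cost


# Buy 1 unit at 20 on the cheaper exchange, sell it at 20.50 on the other one.
gross = two_point_profit(1.0, 20.50, 20.0)
net = net_two_point_profit(1.0, 20.50, 20.0, fee_rate=0.002)
print(round(gross, 4), round(net, 4))
```

Even a small proportional fee on both legs visibly reduces the gross profit, which is why transaction costs have to be included before an opportunity is classified as profitable.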
As mentioned before, crypto assets are traded in pairs, such as BTC/EUR or BTC/ETH. The
second example defines how much Ethereum (ETH) one needs to “sell” or trade in to buy a
defined unit of BTC. Therefore, if one exchange quotes a certain price for BTC/ETH and
another exchange quotes a higher price for the same pair, a two-point arbitrage trade can be
made to profit from the price difference. A problem sometimes occurs when two exchanges do
not express the relation between two crypto assets in the same way. One exchange can state a
price for BTC/ETH, for instance, and another one for ETH/BTC. The assets involved are the
same, but the prices are constructed the other way round. To work around this issue, the
software must invert the price from one of the exchanges before comparing it with the price
from the other one, so that the prices of the trade are calculated and matched correctly. In
this case, a reversed arbitrage trade is done, where the formula for the profit is (Pauna, 2018):
$$P_i = V_i \left( \mathit{sell}_i - \frac{1}{\mathit{buy}_i} \right)$$
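How a reversed quote can be brought into the same orientation before comparison is sketched below. The helper `normalize_quote`, the pair strings and the example prices are hypothetical and only illustrate the inversion step; they are not part of the cited software design.

```python
def normalize_quote(price: float, pair: str, wanted_pair: str) -> float:
    """Return the price expressed in wanted_pair, inverting it if the pair is reversed."""
    base, quote = pair.split("/")
    wanted_base, wanted_quote = wanted_pair.split("/")
    if (base, quote) == (wanted_base, wanted_quote):
        return price
    if (base, quote) == (wanted_quote, wanted_base):
        return 1.0 / price  # ETH/BTC -> BTC/ETH and vice versa
    raise ValueError(f"{pair} and {wanted_pair} do not contain the same assets")


# Exchange 1 quotes BTC/ETH directly, exchange 2 quotes the reversed pair ETH/BTC.
price_exchange_1 = normalize_quote(15.8, "BTC/ETH", "BTC/ETH")
price_exchange_2 = normalize_quote(0.0625, "ETH/BTC", "BTC/ETH")
print(price_exchange_1, price_exchange_2)
```

After normalization both prices state how many ETH one BTC costs, so they can be compared directly.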
2.2.3.2 Triangular arbitrage
In contrast to pure arbitrage, which requires two price quotes, one from each market, for
arbitrage possibilities to emerge, triangular arbitrage needs three for its implementation. This
method can also be performed cross-exchange, but is mostly used in one exchange to
exploit internal price misalignments (Mohan, 2022). When carried out in one specific
exchange, it is the ideal way to identify market-specific frictions and to compare price
efficiency of different markets (Barbon and Ranaldo, 2021).
Consider three tokens $X$, $Y$, and $Z$, available for trading, where a trader can swap between
any pair. The idea of triangular arbitrage is to initiate a sequence of trades, starting with one
asset, converting it to another one and then converting it back to the starting asset, ending
with more units than at the start. For example, one takes 1 unit of the asset represented by
$Z$, then loops through the assets by, e.g., selling $Z$ for $X$, then trading the $X$ proceeds
for $Y$, before converting back to $Z$ and ending up with more than 1 unit of $Z$. This
sequence of conversions is represented by $Z \to X \to Y \to Z$. It can also be
$Z \to Y \to X \to Z$ if executed in the opposite order, and $Y$ or $X$ can also be the starting
asset. In either case, if the trader starts with 1 unit of an arbitrary asset ($X$, $Y$ or $Z$)
and ends with more than 1 unit, the triangular arbitrage trade is successful. A triangle can
represent all these possibilities, with the three assets at the vertices and a certain cycle
tracing a path along the sides, beginning and ending at a particular vertex (Mohan, 2022).
It should be noted that this method of arbitrage requires rapid data collection and analysis
and, consequently, fast execution of the necessary trading sequence. Trading commissions
and types of order execution must also be taken into consideration. A number of algorithms
are used to find such arbitrage opportunities by finding the shortest path in a weighted graph
from one node to another (Levus et al., 2021).
Figure 2: Visualization of a triangular arbitrage trading process
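One common way to phrase the search for such cycles as a shortest-path problem is to weight each conversion with the negative logarithm of its rate, so that a cycle whose rates multiply to more than 1 becomes a negative-weight cycle. The sketch below uses the Bellman-Ford relaxation for this; the choice of Bellman-Ford and all names are assumptions for illustration, as the source does not name a specific algorithm.

```python
import math


def has_arbitrage_cycle(rates: dict) -> bool:
    """Detect a cycle whose rate product exceeds 1 via Bellman-Ford on -log(rate) weights."""
    nodes = {asset for pair in rates for asset in pair}
    edges = [(src, dst, -math.log(rate)) for (src, dst), rate in rates.items()]
    # Starting every node at distance 0 acts like a virtual source linked to all nodes.
    dist = {node: 0.0 for node in nodes}
    for _ in range(len(nodes) - 1):
        for src, dst, weight in edges:
            if dist[src] + weight < dist[dst]:
                dist[dst] = dist[src] + weight
    # If any edge can still be relaxed, the graph contains a negative cycle.
    return any(dist[src] + weight < dist[dst] - 1e-12 for src, dst, weight in edges)


profitable = has_arbitrage_cycle({("Z", "X"): 2.0, ("X", "Y"): 3.0, ("Y", "Z"): 0.17})
consistent = has_arbitrage_cycle({("A", "B"): 2.0, ("B", "A"): 0.5})
print(profitable, consistent)
```

The small tolerance in the final check guards against floating-point noise when rates are exactly consistent. Fees could be included by multiplying each rate with a fee factor before taking the logarithm.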
2.2.3.3 Statistical arbitrage / Pairs trading
For statistical arbitrage trading, also referred to as pairs trading, the strategy lies in going
short in one exchange and going long on another. Several statistical methods are the basis
for this strategy, like Arbitrage Pricing Theory, which will be described later.
This strategy involves going long on assets that are relatively undervalued and going short
on assets that are relatively overvalued. When the spread, or measure of relative mispricing,
converges, a profit can be made by unwinding the position. Specifically, pair trading involves
simultaneously opening long and short positions in two correlated assets with a balance point
between them (Do et al., 2006). Regardless of bullish or bearish market conditions, this type
of strategy seeks to take advantage of market inefficiencies (Carrasco Blázquez et al., 2018).
While this strategy does not expose the arbitrageur to the risk of general price fluctuations,
its drawback is the exposure to the risk that the expected price convergence does not occur
(Makarov and Schoar, 2020). This type of arbitrage trading equals a dollar-neutral
mean-reversion strategy when the amounts invested in the short and the long position are
the same (Kakushadze and Yu, 2019).
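A very simple way to express "relative mispricing" is the z-score of the current spread between two assets against its own history. The sketch below is only an illustration under that assumption; the entry threshold of 2 standard deviations and the function names are hypothetical and not taken from the cited papers.

```python
from statistics import mean, stdev


def spread_zscore(prices_a: list, prices_b: list) -> float:
    """Z-score of the latest spread (a - b) relative to the earlier spreads."""
    spreads = [a - b for a, b in zip(prices_a, prices_b)]
    history, latest = spreads[:-1], spreads[-1]
    return (latest - mean(history)) / stdev(history)


def pairs_signal(zscore: float, entry: float = 2.0) -> str:
    """Go short the relatively overvalued asset and long the undervalued one."""
    if zscore > entry:
        return "short A / long B"
    if zscore < -entry:
        return "long A / short B"
    return "no trade"


# Hypothetical price series: asset A suddenly trades far above asset B.
signal = pairs_signal(spread_zscore([10, 11, 10, 11, 14], [10, 10, 10, 10, 10]))
print(signal)
```

The position would be unwound once the spread has converged again. If it never converges, the loss described above as convergence risk materialises.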
2.3 Crypto market
Despite the significant growth of the cryptocurrency market, it has remained largely
unregulated by government institutions. It is hypothesized that this unique, early-stage
environment, for the financial market, may present pricing inefficiencies that could be
identified and exploited through arbitrage trading (Fischer et al., 2019). To study the
possibilities of arbitrage and to implement an automated system for it, it is necessary to
examine the way in which the crypto market is organized. In general, crypto assets can be
acquired in two ways: by mining, where miners find new blocks through computation and
receive a reward in the respective cryptocurrency, or by exchanging them for fiat
currencies or other cryptocurrencies on exchange platforms (Böhme et al., 2015).
2.3.1 Types of exchanges
To purchase assets of a market, one must utilize the services of an exchange platform.
These are institutions, which standardize assets and trading rules for multiple participants, in
contrast to over-the-counter (OTC) markets, where direct trading between two parties is
enabled. To do so, an exchange must provide a number of services. Some of them are typical
of stock exchanges, while others are additional services that crypto exchanges have
to provide, often referred to as alternative trading services (ATS). In traditional markets,
these ATS are often divided among separate institutions (Johnstone, 2019). The majority of exchanges
are similar to traditional stock exchanges in that they maintain the liquidity of assets and
determine the prices of assets through order book systems, which match the orders of
buyers and sellers. Orders are usually public information, allowing market participants to
gather information about interest in an asset and the price at which it is traded. For traditional
exchanges, such as stock exchanges, orders are maintained by a central authority (Mohan, 2022).
ATS commonly comprise market making, contract counterparty, broking, dealing, advisory and
custody services and, for some exchanges, also over-the-counter trading offers, none of which
are typical of a traditional exchange (Johnstone, 2019).
For the crypto market, a distinction must be made between centralized and decentralized
exchanges, where centralized exchanges (CEX) represent the vast majority with 99% (Goldenberg,
2018). A further distinction for CEX can be made between custodial and non-custodial
exchanges. The latter do not manage user wallets, but still match orders through their
internal system and take fees off the top. Most CEX are still custodial, which means that
there is a trusted central institution that processes orders, maintains and secures assets
and provides wallets for customers. This also means that the private keys of user wallets are
held by the centralized exchange and are not known to the users themselves, which puts users’
assets at risk if the company running the exchange runs into problems. The result is a single
point of failure (SPOF), whereas decentralized exchanges (DEX) imply a distributed risk with
no SPOF. Centralized exchanges are easier to implement and provide better user experience and
functionality, but suffer from several drawbacks due to their unified design. Among these are
the possibility of users losing custody of their assets, a single point of attack for
hackers, little regulation, lack of privacy and mismanagement by the exchange operators
(Mohan, 2022). After each hack, CEX implemented new security measures, such as keeping a
large share of assets in “cold storage”, where wallets are not connected to the internet,
providing an extra layer of protection. Still, hacks on exchanges like Mt. Gox in 2014,
Bitfinex in 2015 and Coincheck in 2018 (Goldenberg, 2018) or mismanagement as at FTX in 2022
(Fu et al., 2022) led to a serious lack of trust in centralized exchanges.
In a CEX, all transactions are processed through the exchange's servers, which settle user trades
immediately, in contrast to the typical on-chain verification time of, e.g., Bitcoin, which takes
about 5 minutes to several hours. This is possible because transactions are handled off-chain,
where transactions are verified by the exchange itself, instead of recording and verifying
every order on the blockchain. The only transactions that are recorded on-chain on
centralized exchanges are withdrawals to external wallets or external deposits to internal
wallets (Pourpounehnajafabadi et al., 2020; Schär, 2020). In contrast a decentralised
exchange allows participants to exchange one asset for another without the need for a
centralised third party that is responsible for overseeing trading activity, while users retain
custody of assets and private keys (Mohan, 2022). DEX in their pure form fully take
advantage of blockchain technology. As a result, on-chain order books are used, which
means that all transactions and their corresponding verification is done by software, usually
the blockchain itself or by smart contracts. However, a user must pay for every update to the
order book and wait for the network to reach consensus, which results in a less censored and
more trustworthy exchange, but also in lower speeds and higher transaction costs
(Pourpounehnajafabadi et al., 2020). DEX can, however, also be implemented in other ways,
with off-chain order books or automated market makers. As with centralized exchanges,
off-chain order books can be used, where all orders are handled centrally and only
the final confirmation of a transaction is verified by a smart contract on the blockchain. This
again improves speed and lowers costs, although it requires more trust,
as the order book is not confirmed on-chain each time.
A newer form of providing liquidity in DEXs is through automated market makers, which
operate without the traditional order book system and use algorithmic agents or smart
contracts instead. In contrast to supply and demand pricing, automated market makers pool
liquidity and set prices through a deterministic pricing mechanism. This removes the need for
counterparties but requires arbitrageurs to remove price differences (Pourpounehnajafabadi et
al., 2020). It must be stated that, however customer-friendly the immediate off-chain
handling of user orders is, it forfeits the benefits that a decentralized value-transfer
network like the blockchain was originally invented for. DEX are often limited by the points
mentioned above, and their liquidity and trading volume are lower than those of centralized
exchanges, but the trend is moving in their direction, and centralized exchanges are expected
to move towards a more hybrid form with decentralized elements (Goldenberg, 2018).
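A deterministic pricing mechanism of an automated market maker can be illustrated with the constant-product rule, in which the product of the two pooled reserves stays constant during a swap. This specific rule, the default `fee_rate` and the function names are assumptions chosen for illustration; the source does not state which pricing function an exchange uses.

```python
def amm_spot_price(reserve_base: float, reserve_quote: float) -> float:
    """Marginal price of the base asset in units of the quote asset."""
    return reserve_quote / reserve_base


def amm_swap_out(amount_in: float, reserve_in: float, reserve_out: float,
                 fee_rate: float = 0.003) -> float:
    """Amount received when swapping into a constant-product pool (x * y = k)."""
    amount_in_after_fee = amount_in * (1 - fee_rate)  # hypothetical pool fee
    return reserve_out * amount_in_after_fee / (reserve_in + amount_in_after_fee)


# Hypothetical pool holding 100 units of each asset, swap of 10 units without fee.
received = amm_swap_out(10.0, 100.0, 100.0, fee_rate=0.0)
price_after = amm_spot_price(100.0 - received, 100.0 + 10.0)
print(round(received, 4), round(price_after, 4))
```

The swap moves the pool price away from its starting value of 1, regardless of prices elsewhere. Such a gap to other exchanges is what arbitrageurs are expected to close.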
To narrow down the possible fiat currencies for which cryptocurrencies can be sold, the
currency of the exchange's country of operation must be used in most cases. This results
from the regulations for crypto exchanges, which have become increasingly strict in recent
years. Even if large exchanges operate across multiple regions, investors can usually only
choose the local currency of their respective country as base currency, as order books are
usually held separately (Makarov and Schoar, 2020). In this thesis, the base currency will
therefore be EUR (Euro).
2.3.2 Fees
As a trader, or in the case of this thesis as an arbitrageur, there are a number of transaction
fees to be aware of over the entire trading process in the crypto market. This is of even
greater importance for automated trades, whose profit margins are usually small.
Papers on traditional markets have shown that, after the strategy itself, the
trading costs associated with a particular strategy are the second most important determinant
of investment performance or stated as “Trading costs can substantially reduce the notional,
or ‘paper,’ return to an investment strategy” (Keim and Madhavan, 1997). As a result, it is not
surprising that empirical data from the fund industry has shown that the level of expenses is
negatively correlated with the net return on investment (Carhart, 1997). In general, the
majority of exchanges are striving to achieve lower fees for customers. This is based on the
theory that lower fees lead to a substantial increase in liquidity in terms of narrowing bid-ask
spreads, increasing depth and growth in trading volume (Malinova and Park, 2011).
Fees cannot be generalized across crypto exchanges, as they differ between them. However, for
the majority it can be summarized that fees are applied to every sale, purchase and
withdrawal, and sometimes also to deposits (Kabašinskas and Šutienė, 2021). For selling and
buying, most exchanges state them as maker and taker fees. Maker fees are paid when you add
liquidity to the order book by placing a limit order at or below the ticker price for a buy and at
or above it for a sell. Taker fees are paid when you remove liquidity from the order book by
placing any order that is executed against an order from the order book (“Bitfinex | Our
Fees,” 2023; “Fee Rate,” 2023; “Fee Structures | Explore our trading fees | Kraken,” 2023).
These maker and taker fees can be a fixed rate, as for instance on Bitfinex, which
charges 0.100% maker and 0.200% taker fees for crypto to crypto, crypto to stablecoin or
crypto to fiat transactions. On other exchanges, such as Kraken or Coinbase, they can only be
stated approximately in advance, as the final fees are calculated at the time of
placing an order and may be determined by a combination of factors, including but not
limited to the user's location, selected payment method, asset, size of the order and market
conditions like volatility and liquidity (“Coinbase pricing and fees disclosures,” 2023; “Fee
Structures | Explore our trading fees | Kraken,” 2023). For withdrawals, every
centralized exchange charges a fee, which in effect covers miner fees, as these transactions
have to be recorded on-chain on the respective blockchain, smart contract or protocol of the
crypto asset, which can also lead to varying costs. For a deposit of money on an
exchange, fees also vary. While a transfer of cryptocurrencies or a standard bank transfer
is usually free, fiat deposits by credit card or PayPal come with costs.
Maker and taker fees can therefore be calculated per trade, whereas costs for withdrawals or
deposits must be analysed across all trades. The fee-adjusted bid and ask prices for trading
the quantity $q \geq 0$ on exchange $i$ at time $t$ can be calculated by the following
formulas (Hautsch et al., 2018), where $B_t^i$ (bid) or $A_t^i$ (ask) is the respective asset
price and $\rho^{i,B}(q) \geq 0$ and $\rho^{i,A}(q) \geq 0$ are the corresponding
proportional fees. A sale at the bid thus yields less, and a purchase at the ask costs more,
than the quoted price.
$$\tilde{B}_t^i(q) = B_t^i \left(1 - \rho^{i,B}(q)\right)$$
$$\tilde{A}_t^i(q) = A_t^i \left(1 + \rho^{i,A}(q)\right)$$
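The effect of proportional fees on a cross-exchange opportunity can be sketched as follows. The sketch assumes that selling at the bid reduces the proceeds by the fee and buying at the ask increases the cost by the fee, and it uses the 0.200% taker fee quoted above for Bitfinex on both exchanges purely as an example; function names and prices are hypothetical.

```python
def fee_adjusted_bid(bid: float, fee_rate: float) -> float:
    """Proceeds per unit when selling at the bid and paying a proportional fee."""
    return bid * (1 - fee_rate)


def fee_adjusted_ask(ask: float, fee_rate: float) -> float:
    """Cost per unit when buying at the ask and paying a proportional fee."""
    return ask * (1 + fee_rate)


def is_profitable(bid_sell_exchange: float, ask_buy_exchange: float,
                  fee_sell: float, fee_buy: float) -> bool:
    """True if the fee-adjusted sale price exceeds the fee-adjusted purchase price."""
    return fee_adjusted_bid(bid_sell_exchange, fee_sell) > fee_adjusted_ask(ask_buy_exchange, fee_buy)


TAKER_FEE = 0.002  # 0.200%, used here for both exchanges as an assumption
wide_gap = is_profitable(20.50, 20.00, TAKER_FEE, TAKER_FEE)
narrow_gap = is_profitable(20.05, 20.00, TAKER_FEE, TAKER_FEE)
print(wide_gap, narrow_gap)
```

The second case shows a price difference that exists on paper but is consumed by fees. Withdrawal and deposit costs are not part of this per-trade check.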
2.3.3 Price formation
How prices of crypto assets are formed can best be explained by examining the biggest and
most popular one, Bitcoin. As there are now thousands of crypto assets, not all function the
same: some are backed by a real-world asset and some are not, some have a fixed supply and
some do not, and for some the entire supply is already available, while for others it is
created on an ongoing basis. Bitcoin is a virtual currency with zero intrinsic value,
issued by the blockchain and not backed by a real-world asset or a government. Additionally,
its supply is fixed at 21,000,000 coins, but not all of them are accessible yet because new
bitcoins are mined on a continuous basis (Bouoiyour and Selmi, 2015). Two papers confirm that
the market forces of supply and demand have one of the highest impacts on the price formation
of Bitcoin (as of any other currency), and their importance tends to increase over time. Its
scarcity on the market determines the number of units in circulation. Demand is mainly driven
by the demand of transactions as a medium of exchange for goods and services.
Consequently price movements can be explained by the interactions of supply and demand
(Bouoiyour and Selmi, 2015; Buchholz et al., 2012).
For the example of Bitcoin, the supply is exogenous, so it has no relationship to demand or
price. The observed price changes are due to shifts in demand, because supply does not
change in response to price. Therefore, the intersection between supply and demand should
continuously move down the demand curve, because the quantity of Bitcoin is increasing
over time due to miners. The demand curve is truly a horizontal line as any change in
quantity is fully expected. Hence all observed price fluctuations should be due to shifts in
demand, as supply should not affect the price of bitcoins in dollars over time (Buchholz et al.,
2012). The following figure visualises this example:
Figure 3: All observed price fluctuations occur due to shifts in demand (Buchholz et al., 2012).
Moreover, several features are missing for most cryptocurrencies from fiat currency supply
and demand, which normally form the basis for its price. For this reason, the price formation
cannot be explained by common economic theories such as the future cash flow model,
purchasing power parity or uncovered interest parity. Also, since Bitcoin is not issued by a
central bank or government, it is detached from the current economy, which implies that
there are no macroeconomic fundamentals that could determine its pricing (Bouoiyour et al.,
2014; Kristoufek, 2013). Additionally, the arrival of new information, new posts and increase
in search results on the internet have a positive impact on Bitcoin price in the short run. This
is also associated with incoming speculative investors, affecting the price, and providing
liquidity to the market. In combination with the impact of increasing information on the
internet, this has the short-run downside of increasing price volatility and creating
price bubbles (Ciaian et al., 2016; Kristoufek, 2013). In the short run, but not in the long
run, there is also a significant influence of global macro-financial development, captured by
the Dow Jones Index, the exchange rate and the oil price (Ciaian et al., 2016).
To further understand how prices are set, one must differentiate between asset price
sources, like one of the most famous crypto comparison platforms coinmarketcap.com, and
exchange specific prices. Prices vary, because exchanges are not connected and calculate
prices based on their volume of trades and buy and sell activity, therefore respective supply
and demand of their users. The more trading operations and volume the exchange
processes, the more market-relevant its prices are. News services like Google use an
aggregated price model, while cointelegraph.com and the already mentioned coinmarketcap.com
use their own price indexes, which calculate asset prices by using an average value based on the
prices of top exchanges (Egorova, 2018). As prices for crypto assets on exchanges differ, it
can be concluded that arbitrage opportunities exist. However, not every price discrepancy
presents a profitable arbitrage opportunity.
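An aggregated index price of this kind can be approximated by averaging exchange prices. The sketch below weights each exchange price by its traded volume, which is an assumption for illustration; the exact methodology of the named platforms is not described in the source, and the quotes are hypothetical.

```python
def volume_weighted_price(quotes: list) -> float:
    """Average of (price, volume) pairs, weighted by traded volume."""
    total_volume = sum(volume for _, volume in quotes)
    if total_volume == 0:
        raise ValueError("total volume must be positive")
    return sum(price * volume for price, volume in quotes) / total_volume


# Hypothetical (price, volume) quotes of the same asset on two exchanges.
index_price = volume_weighted_price([(100.0, 1.0), (102.0, 3.0)])
print(index_price)
```

The index lies between the individual exchange prices, so it hides exactly the price differences between exchanges that an arbitrageur looks for.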
2.3.4 Order Books and corresponding definitions
Most crypto exchanges use order books, specifically Limit Order Books (LOB),
which are a collection of limit orders at which market participants are willing to buy or sell.
Centralised exchanges (e.g. Binance, Kraken, Coinbase) are based on Limit Order Books,
whereas decentralized exchanges (e.g. Uniswap, Pancakeswap, Sushiswap) rely on an
Automated Market Maker protocol (Barbon and Ranaldo, 2021). Also the vast majority of
traditional stock exchanges use a LOB or hybrid LOB system to facilitate trading (Gould et
al., 2013). Due to this similarity of centralised and stock exchanges, trading strategies, like
arbitrage trading in this thesis, can be compared on a proper basis.
To understand how arbitrage opportunities can be utilised, one must examine how they occur
and how the trading mechanisms of exchanges work. Therefore, those of Limit Order Books
will be investigated in this chapter. LOBs act in a flexible way, where every trader has the
possibility of submitting buy or sell orders. When a buy or sell order $x$ is posted on an
exchange, a trade matching algorithm checks whether it can be matched against a previously
submitted sell or buy order. If this is possible, the trade executes immediately; otherwise
$x$ becomes active and remains in that status until it is matched to an incoming order or is
cancelled. Cancellation usually occurs when an exchange platform terminates active orders
after a certain time to prevent an overly large accumulation of active orders, or when the
owner of an order cancels it because they no longer wish to trade at the stated
price (Gould et al., 2013).
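The matching step described above can be sketched with a toy order book. The sketch assumes price-time priority and execution at the price of the resting order, and it leaves out cancellation, lot sizes and tick sizes; the class and method names are hypothetical and do not describe any specific exchange.

```python
import heapq
from itertools import count


class LimitOrderBook:
    """Toy book with price-time priority; a simplified sketch, not a real matching engine."""

    def __init__(self):
        self._bids = []      # entries [-price, seq, price, size]: highest price first
        self._asks = []      # entries [price, seq, price, size]: lowest price first
        self._seq = count()  # arrival counter, older orders win ties at equal price
        self.trades = []     # executed (price, size) pairs

    def submit(self, side: str, price: float, size: float) -> None:
        """Match an incoming limit order against the opposite queue, rest the remainder."""
        own, opposite = (self._bids, self._asks) if side == "buy" else (self._asks, self._bids)
        while size > 0 and opposite:
            resting = opposite[0]
            crosses = price >= resting[2] if side == "buy" else price <= resting[2]
            if not crosses:
                break
            traded = min(size, resting[3])
            self.trades.append((resting[2], traded))
            size -= traded
            resting[3] -= traded  # the heap key is unchanged, so the heap stays valid
            if resting[3] == 0:
                heapq.heappop(opposite)
        if size > 0:  # unmatched remainder becomes an active order
            key = -price if side == "buy" else price
            heapq.heappush(own, [key, next(self._seq), price, size])

    def best_bid(self):
        return self._bids[0][2] if self._bids else None

    def best_ask(self):
        return self._asks[0][2] if self._asks else None


book = LimitOrderBook()
book.submit("sell", 101.0, 2)
book.submit("sell", 100.0, 1)
book.submit("buy", 100.5, 3)  # matches the sell at 100.0, the rest becomes active
print(book.trades, book.best_bid(), book.best_ask())
```

The incoming buy order is matched only against the sell order it crosses. Its unmatched remainder stays in the book as an active order.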
The following definitions for LOBs from Gould et al. (2013) need to be understood or are used
for calculations in later methods of this thesis:
• An order $x = (p_x, \omega_x, t_x)$ with price $p_x > 0$ and size $\omega_x > 0$ or
$\omega_x < 0$ is a commitment to sell or buy, respectively, up to $|\omega_x|$ units of a
traded asset, submitted at time $t_x$.
• The resolution parameters of a LOB are the lot size $\sigma$, which is the smallest amount
of an asset that can be traded within it, $\omega_x \in \{\pm k\sigma \mid k = 1, 2, \ldots\}$,
and the tick size $\pi$, which is the smallest possible interval between prices, also called
accuracy.
• A LOB $\mathcal{L}(t)$ is the set of all active orders in a market at time $t$.
• A LOB can be considered as a set of queues of active buy orders $B(t)$, for which
$\omega_x < 0$, and active sell orders $A(t)$, for which $\omega_x > 0$.
• The depth of available orders at price $p$ at time $t$ is defined for the bid-side as
$$n^b(p, t) := \sum_{\{x \in B(t) \mid p_x = p\}} \omega_x$$
and for the ask-side as
$$n^a(p, t) := \sum_{\{x \in A(t) \mid p_x = p\}} \omega_x$$
• The bid price is the highest stated price among active buy orders at time $t$.
$$b(t) := \max_{x \in B(t)} p_x$$
• The ask price is the lowest stated price among active sell orders at time $t$.
$$a(t) := \min_{x \in A(t)} p_x$$
• The bid-ask spread is the difference between the ask and bid price at time $t$.
$$s(t) = a(t) - b(t)$$
• The mid-price at time $t$ is the calculated middle price between the ask and bid price.
$$m(t) := \frac{a(t) + b(t)}{2}$$
• Sometimes it is better to compare orders by relative price, where the bid-relative price
is $\delta^b(p) := b(t) - p$ and the ask-relative price is $\delta^a(p) := p - a(t)$.
Figure 4: Schematic functionality of a Limit Order Book System (Gould et al., 2013)
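The LOB quantities defined above can be computed directly from a list of active orders. The sketch follows the sign convention of Gould et al. (2013), in which a negative size marks a buy order and a positive size a sell order; the `Order` tuple, the function names and the example orders are assumptions for illustration.

```python
from collections import namedtuple

# An order x = (p_x, omega_x, t_x): size < 0 is a buy order, size > 0 a sell order.
Order = namedtuple("Order", ["price", "size", "time"])


def bid_price(orders):
    """Highest price among active buy orders, b(t)."""
    return max(o.price for o in orders if o.size < 0)


def ask_price(orders):
    """Lowest price among active sell orders, a(t)."""
    return min(o.price for o in orders if o.size > 0)


def spread(orders):
    """Bid-ask spread s(t) = a(t) - b(t)."""
    return ask_price(orders) - bid_price(orders)


def mid_price(orders):
    """Mid-price m(t) = (a(t) + b(t)) / 2."""
    return (ask_price(orders) + bid_price(orders)) / 2


def depth(orders, price, side):
    """Sum of order sizes at one price on the 'bid' or 'ask' side."""
    on_side = (lambda o: o.size < 0) if side == "bid" else (lambda o: o.size > 0)
    return sum(o.size for o in orders if on_side(o) and o.price == price)


active = [Order(99.0, -2, 1), Order(100.0, -1, 2), Order(101.0, 3, 3), Order(102.0, 1, 4)]
print(bid_price(active), ask_price(active), spread(active), mid_price(active))
```

With this sign convention the bid-side depth is negative, because it sums the negative sizes of buy orders.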
In a LOB, at time $t$, the maximum price at which at least the lot size of the traded asset can
be sold immediately is the bid price $b(t)$, while the minimum price at which at least the lot
size can be bought immediately is the ask price $a(t)$. In addition to buy and sell orders, a
further distinction has to be made between limit and market orders. While limit orders can
match at better prices, they are also at risk of never being matched and remaining in the
active queue until cancellation. In contrast, market orders do not face this uncertainty,
although they never match at prices better than the bid price $b(t)$ or the ask price $a(t)$
(Gould et al., 2013). Specifically, a market order is a request for immediate trading at the
best price currently available in the market (Parlour and Seppi, 2008). The bid-ask spread
$s(t)$ can be considered a measure of the value the market places on the immediacy and
certainty of market orders versus the waiting and uncertainty associated with limit orders
(Gould et al., 2013).
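The order book quantities defined above can be illustrated with a minimal Python sketch. It assumes the sign convention of Gould et al. (2013), in which sell orders carry a positive and buy orders a negative size; the `Order` class, the function names and the example values are hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    price: float  # p_x
    size: float   # omega_x: positive for sell orders, negative for buy orders
    time: float   # t_x, submission time


def bid_price(book):
    """b(t): highest price among active buy orders."""
    return max(o.price for o in book if o.size < 0)


def ask_price(book):
    """a(t): lowest price among active sell orders."""
    return min(o.price for o in book if o.size > 0)


def spread(book):
    """s(t) = a(t) - b(t)."""
    return ask_price(book) - bid_price(book)


def mid_price(book):
    """m(t) = (a(t) + b(t)) / 2."""
    return (ask_price(book) + bid_price(book)) / 2


def depth(book, price, side):
    """n^b(p, t) or n^a(p, t): summed order sizes resting at one price."""
    is_side = (lambda o: o.size < 0) if side == "bid" else (lambda o: o.size > 0)
    return sum(o.size for o in book if is_side(o) and o.price == price)


# Hypothetical snapshot of active orders (illustrative values only).
book = [
    Order(99.0, -2.0, 1.0),
    Order(99.0, -1.0, 2.0),
    Order(98.5, -4.0, 3.0),
    Order(100.0, 1.5, 4.0),
    Order(100.5, 3.0, 5.0),
]

print(bid_price(book), ask_price(book), spread(book), mid_price(book))
```

Because the depth is the sum of the signed sizes, the bid-side depth is negative under this convention; taking the absolute value gives the number of units available at that price.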
LOBs are popular because they allow some traders to demand immediacy, whilst allowing
others to provide it to those who require it later. Arbitrageurs, technical traders and indexers,
whose activities are fast and often automated, will most likely submit market orders, whereas
portfolio managers, whose focus is on long-term investments, will submit limit orders
(Foucault et al., 2005). This follows from the arbitrageur’s strategy of simultaneously buying
and selling an asset in an attempt to make an instant profit. As arbitrageurs calculate with a
specific price at the moment an order is submitted, a limit order that may match late, or
never, is of little use to them.
2.3.5 Theoretical concepts for Arbitrage Trading
The efficient market hypothesis, along with the arbitrage pricing theory and the capital asset
pricing model, have been crucial in understanding movements in the financial markets
through mathematical models. However, their applicability is often challenged by real world
data (Weron and Weron, 2000). This highlights the need for a broader perspective on the
applicability of these theoretical concepts to better understand potential discrepancies and
inefficiencies in financial markets. This chapter examines the efficient market hypothesis,
arbitrage pricing theory and the law of one price. Finally, a conclusion is drawn on the
potential for arbitrage opportunities.
2.3.5.1 The Efficient Market Hypothesis
The efficient market hypothesis (EMH) has become one of the most dominant paradigms in
finance and is widely accepted in the literature of finance, accounting, and the economics of
uncertainty (Keim and Madhavan, 1997). It is generally accepted that Bachelier (1900) was the
first to discover that security prices in speculative financial markets follow Brownian motion
and an irregular random walk, which implies that investors cannot earn any excess return by
detecting price fluctuations. This assumption is also expressed by the economic adage that
there is no such thing as a free lunch (Liu et al., 2022). Fama (1970) was the first to formally
propose the efficient market theory, which builds on this assumption, and stated that a market
is efficient if transactions take place at the correct value, because all available information
is always included in the price.
According to the efficient market hypothesis, an ideal market is one where prices provide
accurate signals for allocating financial resources, under the assumption that asset prices
fully reflect all available information at any time (Fama, 1970). Jensen (2002) makes this
hypothesis dependent on the amount of information available about the market, where
$\theta_t$ represents the given information set and economic profits are the risk-adjusted
returns net of all costs:
A market is efficient with respect to information set $\theta_t$ if it is impossible to
make economic profits by trading on the basis of information set $\theta_t$.
The EMH has been examined in different forms with varying results, which led to three
categories of the hypothesis that are mainly distinguished by the given information set
$\theta_t$ (Fama, 1970; Jensen, 2002):
1. The weak form of the EMH, where the information set $\theta_t$ contains only the past
price history of the market as of time $t$.
2. The semi-strong form of the EMH, where $\theta_t$ contains all information that is publicly
available at time $t$, which also includes the past prices of the weak form.
3. The strong form of the EMH, where $\theta_t$ contains all information that is known to
anyone at time $t$.
The third form is an extreme case, which is a logical completion of the first two but not a
realistic representation. When the literature refers to the efficient market hypothesis, it
normally means the second form.
2.3.5.2 Arbitrage Pricing Theory
Arbitrage pricing theory (APT) was developed by Ross in 1976 as an alternative to the
popular capital asset pricing model for explaining asset or portfolio returns. Its purpose is to
determine the fair value of an asset (Ross, 1976). It assumes that equity returns can be
predicted using a linear model of several systematic risk factors, such as the business cycle
and changes in inflation, interest rates or exchange rates. If the current market value differs
from the value calculated by the model, the APT indicates that the asset is either overvalued
or undervalued. For example, if the arbitrage pricing model valued a stock at 300, but the
current market price of the stock is 250, the stock would be considered undervalued.
According to APT, the price of the stock should eventually correct itself, creating an arbitrage
opportunity (Ross, 1976). Unlike pure arbitrage, which places constraints on prices that are
observed at a particular moment in time, APT attempts to explain expected returns at
different points in time. Therefore, APT is mainly of use for statistical arbitrage, which will be
explained later, and offers only a minor advantage for pure arbitrage (Poitras, 2010).
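The linear factor structure assumed by the APT is commonly written in the following textbook form. The notation is introduced here for illustration and is not taken from the thesis: $R_i$ is the return of asset $i$, $F_k$ are zero-mean surprises in the $K$ systematic risk factors, $\beta_{ik}$ are the sensitivities of the asset to these factors, $\lambda_k$ are the factor risk premia, $r_f$ is the risk-free rate and $\varepsilon_i$ is the asset-specific noise.

```latex
\begin{align}
R_i &= \mathbb{E}[R_i] + \sum_{k=1}^{K} \beta_{ik} F_k + \varepsilon_i, \\
\mathbb{E}[R_i] &= r_f + \sum_{k=1}^{K} \beta_{ik} \lambda_k .
\end{align}
```

An asset whose market price implies an expected return above the right-hand side of the second equation would be regarded as undervalued by the model, and one below it as overvalued.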
2.3.5.3 Law of one Price
The law of one price is an economic cornerstone and states that in a perfectly competitive
market, with many buyers and sellers and no barriers to entry, identical goods and services
must sell for the same price. This price then represents an equilibrium between supply and
demand. If, for example, different prices exist in a market for a particular product, buyers
would tend to buy only from the supplier who offers the lowest price. This would force the
other suppliers to lower their prices to remain competitive, ultimately leading to a single
price. In addition, a market with different prices would lead to the emergence of arbitrageurs,
which in turn would lead to a standardisation of prices (Isard, 1976).
2.3.5.4 Chances for Arbitrage Opportunities
In theory, financial markets are in economic equilibrium or move towards it, which excludes
the possibility of “making money out of nothing”, the already mentioned “free lunch”.
Therefore, arbitrage opportunities should not occur, and under the no-arbitrage principle of
mathematical finance their existence is considered unrealistic. In addition, all mathematical
models of financial markets have to satisfy an arbitrage-free condition to be realistic models
(Fontana, 2015). Economic equilibrium is a state of balance of market forces, a concept
borrowed from the physical sciences, where observable physical forces can balance each
other. Because of the dynamic and uncertain nature of the conditions underlying supply and
demand, it is a fundamentally theoretical construct that may never occur in an economy.
Consequently, the economy is in pursuit of equilibrium without ever actually achieving it
(Jofre et al., 2014). As the quality with which markets operate varies strongly and does not
meet the conditions of financial theory, opportunities for arbitrage occur. If the efficient
market hypothesis holds, arbitrage opportunities should not be available for assets that are
cross-listed on multiple markets, such as crypto assets across exchanges.
However, it must be taken into account that the hypotheses mentioned above were
developed on the basis of traditional financial markets. Since the crypto market differs from
traditional markets in many key characteristics, four aspects were identified that influence
opportunities for arbitrage. First, because price formation happens on each exchange
individually and fiat currency cannot flow seamlessly across regions, price fluctuations and
formations in individual markets do not reflect cross-market information in a timely manner.
This type of friction prevents markets from forming a consensus, so price disparities between
markets do not disappear and are inevitably linked to market efficiency failures (Duan et al.,
2021; Makarov and Schoar, 2020). These findings were obtained for Bitcoin. As they are
consistent with the evidence from existing financial markets, the crypto market orients itself
towards the largest cryptocurrency, and other assets are treated similarly on exchanges, it is
assumed that they also hold for other crypto assets (Duan et al., 2021). Both papers conclude
that arbitrage opportunities exist within and across markets, and that they are greater across
regions than within them (Duan et al., 2021; Makarov and Schoar, 2020). Second, crypto
assets are completely identical across exchanges and countries, in contrast to stocks and
bonds, which can differ. Third, crypto markets operate 24 hours a day, seven days a week,
with continuously available pricing data (Dwyer, 2015). Fourth, the crypto market is unique in
that there is no government regulation comparable to the US Securities and Exchange
Commission's National Best Bid and Offer regulation, which allows traders to get the best
possible price by comparing prices from different markets (Makarov and Schoar, 2020).
Duan et al. (2021) also relate cross-market arbitrage opportunities to market
efficiency. The potential for cross-market arbitrage can be closely related to the share of
active arbitrageurs in each market and to their migration behaviour from one market to
another. A market with a high level of arbitrage activity processes new information more
quickly than a market with a low level, thereby increasing market efficiency. In addition, a
sudden change in arbitrage opportunities can result in arbitrage migration between markets,
leading to changes in market efficiency. Conversely, changes to market efficiency can lead to
arbitrage migration, which can lead to price dispersion between markets. This does not
necessarily require physical migration of arbitrageurs, only a change from being active to
being inactive or vice versa. For example, if arbitrageurs are active only when markets are
relatively inefficient and inactive when markets are relatively efficient, a similar effect of
migration occurs (Duan et al., 2021).
In general, arbitrage opportunities arise in low-quality markets, which is why this factor is
examined in more detail. This hypothesis is supported by the positive relationship between
liquidity and arbitrage activity, which in turn improves market efficiency (Chordia et al.,
2008). In addition, three further papers found that liquidity and market efficiency are
positively correlated in the crypto market (Al-Yahyaee et al., 2020; Brauneis and Mestel,
2018; Wei, 2018). Therefore, market quality will be addressed by focusing on market liquidity
and efficiency.
2.3.5.5 Liquidity
As definitions of liquidity vary, the term is difficult to define formally. Black (1971) describes
a market for a stock as liquid if the following conditions hold. First, there are always bid and
ask prices available for a trader who wants to buy or sell small amounts of shares
immediately. Second, the bid-ask spread is always small. Third, an investor who buys or
sells a large number of shares can, in the absence of specific information, expect to do so
over a long period of time at a price that on average does not differ significantly from the
current market price. Fourth, a trader can buy or sell large blocks of securities immediately,
but at a premium or discount that depends on the size of the block: the larger the block, the
greater the premium or discount.
Building upon these terms, Kyle (1985) identified tightness, depth and resilience as the three
key characteristics of a liquid market. Tightness refers to the cost of turning around a
position over a short period of time, as it should be costless to unwind a position in a
perfectly liquid market. Depth refers to a market's ability to absorb order volumes without
significant market impact. This relates to the order book definitions given earlier: a market is
said to have depth when there are many market and limit orders at prices around the last
trade. Resilience is a measure of the speed of price recovery from a random, uninformative
shock.
2.3.5.6 Efficiency
For efficiency, one must distinguish between markets being efficient with respect to prices
or to information. As the EMH has already been discussed, it can be extended by stating that
in financial markets the concept of informational efficiency refers to the ability of markets to
integrate unexpected news quickly and accurately into current prices. The EMH suggests that
prices should exhibit a random walk pattern, resulting in unpredictable price movements over
time, if new information is correctly and quickly incorporated into asset prices. Therefore, the
random walk is a property of a perfectly efficient market, and tests for it have been used to
assess the informational efficiency of markets and, in so doing, to test the EMH (Ozenbas et
al., 2022). Kühl (2010) also shows that market inefficiency can originate not only from an
individual market or exchange, which is the common approach, but also from a
cross-market (or cross-exchange) development.
3 Methods
In the context of this thesis, the overarching research question "How are traditional trading
strategies of financial markets, such as arbitrage trading, also applicable with assets on the
crypto market in an automated way?" will be answered. To achieve this goal, this work is
divided into three methods, each with its own research question. In the first, the necessary
data and information will be collected and acquired. In the second, a concept is further
developed to find suitable crypto assets and exchanges that have an increased arbitrage
possibility. In the last method, a prototype for arbitrage trading will be implemented and
tested.
To achieve this goal, literature research and the scientific method of Design Science
according to Alan Hevner (2004) are used. In Design Science, artifacts are formed through
an iterative cycle of analysis, implementation, and evaluation of a created model.
It is therefore a method of addressing research problems by creating and testing artifacts
designed to address a specific business requirement. Design Science consists of
three research cycles: the relevance, the design and the rigor cycle. The relevance
cycle connects the project environment with the design science activities. The rigor cycle
connects design science activities with experience, expertise and scientific foundations. The
central design cycle iterates between the research processes and the building and
evaluation of the design artifacts. An artifact is an answer to a specific research problem. It is
tested to see how well it is actually suited to solve the initial problem (Hevner and Chatterjee,
2010).
Other scientific methods that were considered are case studies and expert interviews.
As no accessible and performant arbitrage trading software is available to study in
detail, apart from some open-source bots, case studies were not shortlisted. Since
trading software of this kind can generate significant amounts of money, larger
companies or investors who use such technology try to keep it to themselves. Therefore, the
method of conducting case study research is not appropriate. Expert interviews were also
considered, as they would shorten methods one and two, but since the literature provides the
necessary basic information, a dedicated concept for filtering assets based on historical data
was worked out instead. Known principles from the literature were thus applied to the data,
which is regarded as the single point of truth. Expert interviews would therefore not have
produced equally conclusive results and would not have added any substantial benefit.
3.1 Data collection & management
In the first method, the research question of which information must be gathered to enable
decision making for arbitrage trading opportunities will be answered. To achieve this goal,
data has to be collected from exchanges and knowledge gathered from the literature.
This method is separated into two steps. In the first step, a knowledge base is built up on
which information about crypto assets and exchanges is needed, according to current
literature, to detect arbitrage. Most crypto exchanges are governmentally regulated
nowadays (Kabašinskas and Šutienė, 2021). Therefore, some exchanges drop out, and the
selection can be limited to the permitted ones. In addition, only centralised exchanges are
used, as they are easier to deal with within the scope of this thesis and have almost no
validation time, unlike decentralised exchanges, where a submitted trade is not
certain to be executed at the quoted price (Hautsch et al., 2018).
In addition, there are fees that must be paid for trading assets against each other, as well as
deposit and withdrawal fees on centralized exchanges. Some of this information can be
collected from crypto market utility sites, which aggregate and report data on different
exchanges or assets. If it is not available there, the data has to be mined or aggregated
independently from the websites of the respective exchanges.
In the second step, pricing data on diverse crypto assets will be collected. The highest
possible resolution is tick-level data, which is intraday data and represents a series of
executed trades or bid/ask quotes from different exchanges. For the scope of this thesis,
OHLCV data fits best, as it reflects the prices at which trades were actually executed. This
format of historical price data is retrieved per time interval, ideally per minute. There are
several platforms that offer this data on the internet, which can then be joined based on
timestamps. However, these providers usually charge high fees. If reliable data is found from
such a source, it will be aggregated this way. Otherwise, it will have to be retrieved directly
from exchanges via their public Application Programming Interfaces (APIs), for which an
open-source library will be used. The price information and the additional information
regarding exchanges and assets are then brought together, cleaned, and prepared for further
analysis in the second method.
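To make the relationship between tick-level trades and OHLCV data concrete, the following minimal Python sketch aggregates executed trades into candles per time interval. The function name, the tuple layout `(timestamp, price, amount)` and the example trades are assumptions made for illustration and do not describe the format of any particular data provider.

```python
def trades_to_ohlcv(trades, interval_s=3600):
    """Aggregate (timestamp_s, price, amount) trades into OHLCV candles.

    Returns rows of [interval_start, open, high, low, close, volume],
    sorted by interval start. Intervals without trades produce no row.
    """
    candles = {}
    # Process trades in time order so that open and close are well defined.
    for ts, price, amount in sorted(trades, key=lambda trade: trade[0]):
        start = ts - ts % interval_s  # start of the interval containing ts
        if start not in candles:
            candles[start] = [start, price, price, price, price, 0.0]
        candle = candles[start]
        candle[2] = max(candle[2], price)  # high
        candle[3] = min(candle[3], price)  # low
        candle[4] = price                  # close: last trade seen so far
        candle[5] += amount                # traded volume
    return [candles[start] for start in sorted(candles)]


# Hypothetical trades (illustrative values only).
example_trades = [
    (3600, 100.0, 1.0),
    (3700, 105.0, 0.5),
    (7100, 98.0, 2.0),
    (7200, 99.0, 1.0),
]
hourly = trades_to_ohlcv(example_trades)
```

Candles from different exchanges that share the same interval start can then be joined on that timestamp, as described above.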
3.2 Concept for crypto asset filtering
In the second method, the specific research question of what requirements and criteria for a
crypto asset are to be considered for arbitrage trading systems will be answered. This
concept will be the basis for the arbitrage-opportunity detection part of the system in the next
method and builds on the data gathered in the first method.
Through literature research, this method aims to understand how arbitrage occurs, how it
can be detected mathematically and what indications of arbitrage there are for assets,
markets or exchanges. The arbitrage index, which is also used in other papers, is applied,
and price differences are calculated and visualised. With this knowledge base, a concept for
filtering crypto assets by suitability for arbitrage trading will be developed using design
science. Crypto assets which are suitable and have an increased potential for
arbitrage opportunities will thus be filtered based on the data and information gathered in the
previous method. This data analysis will be conducted using Python and its associated
packages.
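As a rough illustration of the arbitrage index, the sketch below assumes one common definition, namely the ratio of the highest to the lowest price of the same asset across exchanges at a given timestamp, in the spirit of Makarov and Schoar (2020). The function name, the input layout and the example prices are hypothetical; the exact definition used later in the thesis may differ.

```python
def arbitrage_index(prices_by_exchange):
    """Return {timestamp: max_price / min_price} across exchanges.

    prices_by_exchange maps an exchange name to {timestamp: close_price}.
    Timestamps quoted on fewer than two exchanges are skipped, because
    no cross-exchange comparison is possible for them.
    """
    prices_at = {}
    for series in prices_by_exchange.values():
        for ts, price in series.items():
            prices_at.setdefault(ts, []).append(price)
    return {
        ts: max(prices) / min(prices)
        for ts, prices in sorted(prices_at.items())
        if len(prices) >= 2
    }


# Hypothetical hourly close prices of one asset on two exchanges.
example_prices = {
    "exchange_a": {0: 100.0, 3600: 102.0},
    "exchange_b": {0: 101.0, 3600: 102.0, 7200: 103.0},
}
index = arbitrage_index(example_prices)
```

An index of 1.0 means that all exchanges quote the same price, while larger values indicate a wider price gap and thus a potential arbitrage opportunity before fees.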
3.3 Arbitrage trading prototype
In the third method, the specific research question of how an arbitrage trading strategy can be
realized in the crypto market as a software prototype will be answered. To achieve this goal,
a system that can identify arbitrage opportunities, calculate profits and simulate trades will
be designed and implemented. Within the scope of this thesis, only paper trading is done,
which means that the function triggered to execute a trade logs it to a local database,
incorporating trading fees, to simulate a real trade.
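A minimal sketch of such a paper trade could look as follows. It uses the SQLite module from the Python standard library as the local database; the table layout, the function name, the fee rates and the example prices are assumptions made for illustration and are not the schema of the prototype. Deposit and withdrawal fees are deliberately left out.

```python
import sqlite3


def simulate_arbitrage(conn, asset, buy_exchange, buy_price,
                       sell_exchange, sell_price, amount,
                       buy_fee=0.001, sell_fee=0.001):
    """Log one simulated buy/sell pair and return the profit after fees."""
    cost = buy_price * amount * (1 + buy_fee)        # paid incl. trading fee
    proceeds = sell_price * amount * (1 - sell_fee)  # received after fee
    profit = proceeds - cost
    conn.execute(
        "CREATE TABLE IF NOT EXISTS paper_trades ("
        "asset TEXT, buy_exchange TEXT, sell_exchange TEXT, "
        "amount REAL, cost REAL, proceeds REAL, profit REAL)"
    )
    conn.execute(
        "INSERT INTO paper_trades VALUES (?, ?, ?, ?, ?, ?, ?)",
        (asset, buy_exchange, sell_exchange, amount, cost, proceeds, profit),
    )
    conn.commit()
    return profit


# In-memory database and hypothetical prices, for illustration only.
conn = sqlite3.connect(":memory:")
profit = simulate_arbitrage(
    conn, "BTC/EUR", "exchange_a", 25000.0, "exchange_b", 25200.0, 0.1
)
print(round(profit, 2))
```

With the assumed fee of 0.1 % per side, a price gap of 200 on 0.1 units leaves a profit of roughly 14.98 instead of 20, which shows why fees have to be part of the simulation.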
Design science according to Alan Hevner will also be used in this method. To better relate
this to the use case of implementing a development prototype, the more specific Systems
Development Framework in Information Systems Research will be used, which consists of
five stages.
[Figure 5 shows five consecutive stages: Construct a conceptual framework; Develop a system
architecture; Analyze & design the system; Build the prototype system; Evaluate the system.]
Figure 5: System development research model according to Design Science by Hevner. Adapted from
Nunamaker (Hevner and Chatterjee, 2010)
In the first stage, a conceptual framework is constructed by studying relevant disciplines
through literature research, based on the method’s research question. In the second stage,
the system architecture is created by developing a modular and extensible architecture and
defining the functionalities of components and their interrelationships. In the third stage, the
system is analysed, and a process is designed to carry out the system functions. In the fourth
stage, the prototype is developed according to the previous steps. In the fifth and last stage,
the system is evaluated by observing its use in case studies (Hevner and Chatterjee, 2010).
For this method, a software tool is selected, and a technical concept of the arbitrage trading
system is planned. This is followed by the development of the prototype with continuous
testing and improvement.
4 Data collection & management
In this method, the research question of which information has to be gathered to enable
decision making for arbitrage trading opportunities will be answered. First, reliable sources
for historical crypto asset prices per exchange are identified. In the next step,
various top exchanges are considered, and the data is mapped into a common format and
saved as files. In the last step, the data is analysed and processed to ensure correct results.
4.1 Data sources
There are various sources for gathering historical crypto asset prices from exchanges.
Possibilities are digital asset data providers like kaiko.com, crypto utility sites like
coinmarketcap.com, cryptocompare.com and coingecko.com, the exchanges themselves, or
open-source libraries. In the context of this thesis, historical data will be collected via
cost-free approaches. Information is needed about crypto assets, exchanges and the
respective OHLCV data, ideally per minute. This abbreviation stands for open, high, low and
close price and volume per time interval. The period chosen for the analysis of the historical
data runs from January 1st, 2022 to April 1st, 2023, and thus covers the last 16 months.
Digital asset data providers like kaiko.com offer high-quality exchange data in formats like
tick-level, order book and OHLCV in different intervals. As these sources are the most
convenient and reliable, other papers such as Makarov and Schoar (2020) also used them.
Their disadvantage is the high price, as selling data is their main business. In the
example of kaiko.com, exact pricing information is only available upon request. There is
also access for academic institutions, but this must be purchased by the respective
institution. For this reason, this data source is not an option for this thesis.
Crypto utility sites like the popular coinmarketcap.com, cryptocompare.com or
coingecko.com offer diverse data about assets and exchanges. This data is available on the
websites of the platforms or accessible via their own APIs. These APIs provide both cost-free
and paid access to data, with some differences in terms of features, limitations, and data
types. Free access generally offers basic data such as coin lists, market data and social
data, as well as historical data in varying limited forms. In contrast, paid
plans offer more comprehensive and detailed data, including historical order book records,
in-depth market analysis, detailed historical data and initial coin offering (ICO) information,
enabling deeper insights.
The information on assets and exchanges required for this thesis can therefore be accessed
free of charge via these APIs. The best free access to historical data in OHLCV format is
provided by cryptocompare.com. However, minute intervals are only available for the last day,
and hourly intervals for three months per call. The desired period of the last 16 months in
minute intervals would only be possible with an annual subscription of around 4.500, similar
to the other providers for the same results (“CoinGecko API Pricing Plans,” 2023; “Pricing |
CryptoCompare API,” 2023; CoinMarketCap, 2023). However, by writing a Python script
which executes several requests covering three months each via the free version of
the CryptoCompare API, a similar result can be achieved with hourly data. This is considered
the main data source in this thesis, also due to the high number of available exchanges.
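The paging approach described above can be sketched as follows. This is not the script used in the thesis: the endpoint URL, the query parameters (`fsym`, `tsym`, `e`, `toTs`, `limit`), the response layout and the page size of 2000 hourly candles (roughly three months) are assumptions that should be checked against the current CryptoCompare API documentation. The paging logic itself is independent of the provider and takes the request function as an argument.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the hourly OHLCV history; verify before use.
API_URL = "https://min-api.cryptocompare.com/data/v2/histohour"


def fetch_page(fsym, tsym, exchange, to_ts, limit):
    """One request for up to `limit` hourly candles ending at to_ts."""
    query = urllib.parse.urlencode(
        {"fsym": fsym, "tsym": tsym, "e": exchange,
         "toTs": to_ts, "limit": limit}
    )
    with urllib.request.urlopen(f"{API_URL}?{query}", timeout=30) as resp:
        payload = json.load(resp)
    return payload["Data"]["Data"]  # assumed layout: list of candle dicts


def collect_hourly(start_ts, end_ts, fetch, limit=2000):
    """Walk backwards from end_ts, one page per request, until start_ts."""
    rows = {}
    to_ts = end_ts
    while to_ts >= start_ts:
        page = fetch(to_ts, limit)
        if not page:
            break
        for candle in page:
            if start_ts <= candle["time"] <= end_ts:
                rows[candle["time"]] = candle  # keyed by time, no duplicates
        next_to = min(c["time"] for c in page) - 3600
        if next_to >= to_ts:  # guard against a page that makes no progress
            break
        to_ts = next_to
    return [rows[ts] for ts in sorted(rows)]
```

A call could then look like `collect_hourly(start, end, lambda to_ts, limit: fetch_page("BTC", "EUR", "Kraken", to_ts, limit))`, where the symbol pair and exchange name are examples only.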
Exchanges themselves also offer historical data via their APIs. Unfortunately, this option is
difficult to access without registering on each exchange, and the documentation of their APIs
is limited. Some exchanges, like Binance, also offer historical data as a zip file download per
currency pair. As this option is rarely available on other exchanges, these two sources are
not shortlisted due to the high complexity and effort involved.
To reduce the complexity of collecting data from individual exchanges directly, an open-source
library can help. It is called CCXT (“CCXT CryptoCurrency eXchange Trading Library,”
2023), is used to connect to and trade with cryptocurrency exchanges, and provides quick
access to market data. It connects directly to the exchanges’ public APIs, but these
interfaces are easily accessible through the functions provided by the library. Because these
connections are wrapped but still direct, the exchanges’ rate limits remain a problem (“CCXT
- Documentation,” 2023). Since the desired historical data is usually not available via one
request, several requests must again be made via a script. However, the public APIs of
exchanges are quite sensitive to those limits, which results in very long retrieval times, if
the IP address is not blocked altogether, temporarily or for a longer period of time. CCXT
offers connections to 101 exchanges as of April 2023 (“CCXT CryptoCurrency eXchange
Trading Library,” 2023). It is considered the second source of historical
data for this thesis.
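The repeated requests via CCXT can be sketched in the same way. The function below only relies on the unified `fetch_ohlcv(symbol, timeframe, since, limit)` method of a CCXT exchange object, which returns rows of `[timestamp_ms, open, high, low, close, volume]`. The function name, the page size and the pause between requests are assumptions for illustration, and the exchange object is passed in so that the sketch itself does not need a network connection.

```python
import time


def fetch_ohlcv_range(exchange, symbol, timeframe, since_ms, until_ms,
                      limit=500, pause_s=0.0):
    """Page through exchange.fetch_ohlcv from since_ms up to until_ms."""
    rows = []
    cursor = since_ms
    while cursor < until_ms:
        batch = exchange.fetch_ohlcv(symbol, timeframe,
                                     since=cursor, limit=limit)
        if not batch:
            break  # the exchange has no more data for this range
        rows.extend(r for r in batch if cursor <= r[0] < until_ms)
        next_cursor = batch[-1][0] + 1  # continue after the last candle
        if next_cursor <= cursor:       # guard against a stuck cursor
            break
        cursor = next_cursor
        time.sleep(pause_s)             # optional extra pause for rate limits
    return rows


# Possible usage with CCXT (requires the ccxt package and network access):
# import ccxt
# exchange = ccxt.kraken({"enableRateLimit": True})
# since = exchange.parse8601("2022-01-01T00:00:00Z")
# until = exchange.parse8601("2023-04-01T00:00:00Z")
# candles = fetch_ohlcv_range(exchange, "BTC/EUR", "1h", since, until)
```

Setting `enableRateLimit` lets the library throttle requests itself, which reduces, but does not remove, the risk of running into the exchange limits mentioned above.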
It should be noted that OHLCV data in minute intervals would have been the desired time
format, which most other papers also use. Due to the constraints of the APIs, it is necessary
to work with hourly data, which is the first limitation in the scope of this thesis.
4.2 Assets and Exchanges
All known papers that have dealt with arbitrage opportunities in the crypto market have
only addressed top crypto assets, such as Bitcoin or Ethereum. It is known from the literature
research that arbitrage opportunities are more likely to arise when liquidity is lower and
assets are not traded heavily enough for price differences to close quickly or to be prevented
altogether. For this reason, the selection is extended to the top 100 assets by market cap on
coinmarketcap.com and cryptocompare.com.
These are the 100 assets that are considered at the beginning, with abbreviation and full
name:
ETH - Ethereum
USDT - Tether
BNB - Binance Coin
USDC - USD Coin
BTC - Bitcoin
XRP - XRP
ADA - Cardano
DOGE - Dogecoin
MATIC - Polygon
SOL - Solana
DOT - Polkadot
LTC - Litecoin
SHIB - Shiba Inu
BUSD - Binance USD
AVAX - Avalanche
TRX - TRON
DAI - Dai
WBTC - Wrapped Bitcoin
LINK - Chainlink
UNI - Uniswap
ATOM - Cosmos
OKB - OKB
LEO - UNUS SED LEO
ETC - Ethereum Classic
XMR - Monero
TON - Toncoin
XLM - Stellar
FIL - Filecoin
BCH - Bitcoin Cash
APT - Aptos
LDO - Lido DAO
TUSD - TrueUSD
ARB - Arbitrum
HBAR - Hedera
NEAR - NEAR Protocol
VET - VeChain
CRO - Cronos
ICP - Internet Computer Protocol
APE - ApeCoin
ALGO - Algorand
GRT - The Graph
QNT - Quant
FTM - Fantom
EOS - EOS
STX - Stacks
MANA - Decentraland
AAVE - Aave
THETA - Theta Network
IMX - Immutable
EGLD - MultiversX
FLOW - Flow
XTZ - Tezos
AXS - Axie Infinity
CFX - Conflux
SAND - The Sandbox
BIT - BitDAO
RPL - Rocket Pool
USDP - Pax Dollar
CHZ - Chiliz
NEO - NEO
KCS - KuCoin Token
OP - Optimism
CRV - Curve DAO
KLAY - Klaytn
GMX - GMX
MKR - Maker
LUNC - Terra Classic
FXS - Frax Share
USDD - USDD
SNX - Synthetix
MINA - Mina Protocol
BSV - Bitcoin SV
ZEC - Zcash
CAKE - PancakeSwap
DASH - Dash
INJ - Injective Protocol
HT - Huobi Token
MIOTA - IOTA
XEC - eCash
RNDR - Render Token
XDC - XDC Network
GT - GateToken
BTT - BitTorrent
WOO - WOO Network
RUNE - THORChain
PAXG - PAX Gold
CSPR - Casper
AGIX - SingularityNET
LRC - Loopring
TWT - Trust Wallet Token
ZIL - Zilliqa
1INCH - 1inch Network
FLR - Flare
CVX - Convex Finance
KAVA - Kava
DYDX - dYdX
Table 3: Top 100 assets by market cap considered at the beginning with abbreviation and full name.
These are the 85 exchanges available for gathering data via both cryptocompare.com and CCXT,
with abbreviation and full name:
aax - AAX
ABCC - ABCC Exchange
ataix - ATAIX
bequant - Bequant
Bibox - Bibox
BigONE - BigONE
Binance - Binance
binanceusa - Binance US
Bit2C - Bit2C
BitBank - BitBank
BitBay - BitBay
bitbuy - Bitbuy
Bitfinex - Bitfinex
bitFlyer - bitFlyer
bitflyereu - bitFlyer EU
bitflyerus - bitFlyer US
Bithumb - Bithumb
bithumbglobal - Bithumb Global
Bitkub - Bitkub
BitMart - BitMart
Bitpanda - Bitpanda
Bitso - Bitso
Bitstamp - Bitstamp
BitTrex - Bittrex
blockchaincom - Blockchain.com
BTCBOX - BTCBOX
BTCMarkets - BTC Markets
BTCTurk - BTCTurk
btse - BTSE
bullish - Bullish
CBX - CBX
Cexio - CEX.IO
Coinbase - Coinbase
Coincheck - Coincheck
CoinCorner - CoinCorner
CoinEx - CoinEx
coinfield - CoinField
CoinJar - CoinJar
Coinmate - Coinmate
Coinone - Coinone
Coinsbit - Coinsbit
crosstower - CrossTower
cryptodotcom - Crypto.com
currency - Currency.com
dcoin - Dcoin
decoin - Decoin
DigiFinex - DigiFinex
eidoo - Eidoo
erisx - ErisX
etoro - eToro
Exmo - Exmo
ftxus - FTX US
Gateio - Gate.io
Gemini - Gemini
gopax - GOPAX
HitBTC - HitBTC
huobijapan - Huobi Japan
huobikorea - Huobi Korea
HuobiPro - Huobi Global
IndependentReserve - Independent Reserve
indodax - Indodax
itBit - itBit
Korbit - Korbit
Kraken - Kraken
Kucoin - KuCoin
LAToken - LATOKEN
Liquid - Liquid
lmax - LMAX Digital
Luno - Luno
Lykke - Lykke
NDAX - NDAX
nominex - Nominex
OKCoin - OKCoin
OKEX - OKEx
P2PB2B - P2PB2B
Paymium - Paymium
probit - ProBit
TheRockTrading - The Rock Trading
Upbit - Upbit
valr - VALR
Vaultoro - Vaultoro
Zaif - Zaif
ZB - ZB.com
ZBG - ZBG
zebitex - Zebitex
Table 4: 85 available exchanges from cryptocompare.com and CCXT with abbreviation and full name.
As already mentioned, only assets that are traded against the fiat currency Euro are
considered in this thesis, so that no additional currency exchange rate conversions have to be
incorporated. A first selection of assets and exchanges therefore happens automatically
and already reduces the considered assets by half. This happens because, first, not all
exchanges support trading against EUR and, second, those that do, do not all support the
same asset pairs. In addition, it can be expected that a few exchanges will provide false data,
either entirely or only for certain asset pairs, which will then be removed manually in the data
processing step.
As mentioned earlier in this thesis, centralized exchanges carry several risks: users can lose
custody of their assets, the exchange is a single point of attack for hackers, there is little
regulation and a lack of privacy, and the exchange operators may mismanage the platform
(Mohan, 2022). A basic review of the exchanges under consideration is therefore necessary.
To obtain a shortlist of centralized exchanges that can be considered trustworthy, the top 25
exchanges of coingecko.com, cryptocompare.com, coinmarketcap.com and kaiko.com are
compared by their exchange score. These scores are ratings based on different criteria, which
vary slightly per platform, and are called Trust Scores, Exchange Scores or Points. First, the
first 25 rows of the ranking table of each website were scraped. Second, the results were
concatenated per exchange with its rating on each platform. In the next step, every exchange
that was not present in at least three of the top 25 rankings was excluded. Additionally,
exchanges that are not accessible on the European market were excluded. Since this thesis
was written in Austria, Austria's best-known crypto exchange Bitpanda (Bitpanda Pro, as this
is the product where trading over an API is possible) is added.
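The selection rule described above can be sketched in a few lines. This is only a minimal illustration under assumptions: the function name shortlist_exchanges, the parameter names and the shortened example rankings are hypothetical and do not reproduce the scraped data, and the check for accessibility on the European market is left out.

```python
from collections import Counter


def shortlist_exchanges(rankings, min_platforms=3, extra=()):
    """Return exchanges that appear in at least `min_platforms` top-25 rankings."""
    counts = Counter()
    for exchanges in rankings.values():
        # normalise the names and count each exchange once per platform
        counts.update({name.strip().lower() for name in exchanges[:25]})
    shortlist = {name for name, n in counts.items() if n >= min_platforms}
    # manually added exchanges (for example Bitpanda) are always kept
    shortlist.update(name.lower() for name in extra)
    return sorted(shortlist)


# hypothetical, shortened rankings for illustration only
rankings = {
    "coingecko": ["Binance", "Kraken", "Coinbase"],
    "cryptocompare": ["Kraken", "Binance", "Bitstamp"],
    "coinmarketcap": ["Binance", "Coinbase", "Kraken"],
    "kaiko": ["Coinbase", "Bitstamp", "Kraken"],
}
shortlist = shortlist_exchanges(rankings, extra=["Bitpanda"])
print(shortlist)
```

In this toy input Bitstamp appears in only two rankings and is therefore dropped, while Bitpanda is kept because it is added manually.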
13 exchanges remain from this selection of the top 25 according to the respective ranking of
the platforms. These are seen as preferred exchanges in the context of this thesis:
Figure 6: Preferred exchanges, resulting from top 25 exchange rankings by platforms trust score.
Scraped Websites with exchange ranking by Trust Scores, Exchange Scores or Points:
Address of Exchange Ranking | Ranked by | Accessed
www.coinmarketcap.com/rankings/exchanges/ | Score | 10/04/2023
www.cryptocompare.com/exchanges/#/overview?f2=Centralized | Points | 10/04/2023
www.coingecko.com/en/exchanges | Trust Score | 10/04/2023
www.kaiko.com/pages/exchange-ranking | Kaiko Exchange Score | 10/04/2023
Table 5: Scraped Websites with exchange ranking by Trust Scores, Exchange Scores or Points.
4.3 Collection of data
As previously mentioned, OHLCV data per minute is not available free of charge from the
main data source cryptocompare.com. For this reason, hourly intervals are the finest data
granularity that can be used. With the library CCXT, which gets data from the exchanges'
public APIs, retrieving OHLCV data in minute intervals was possible, but it only worked
correctly for Binance, Bitstamp and Bitvavo. Moreover, it was rarely the case that all three
exchanges provided minute interval data for the same crypto asset over the desired 16
months. Additionally, CCXT generally delivered a lower number of exchanges than
cryptocompare.com, although the numbers should in fact be the same, as identical exchanges
offer the same trading pairs. For this reason, cryptocompare.com is used as the main data
source and, if its data is incorrect, it is substituted with data from CCXT, if available.
The gathered data is saved in the file structure as seen in Figure 7. In a main data
directory, the script creates a folder with the name “historical_asset_EUR”, where asset is
the asset that is traded against the fiat currency Euro, represented by “EUR”. In each asset
folder, the OHLCV data of the exchanges that offer the desired trading pair is saved. These
files are stored as “exchange_asset_h.csv”, where exchange stands for the respective trading
platform and “h” for data in hourly format.
Figure 7: File structure of gathered historical data.
Data within these exchange files is saved in OHLCV format, which stands for Open, High,
Low, Close price and Volume per time interval. Cryptocompare.com delivers volume divided
into “Volume from” and “Volume to”, where the first stands for the number of assets traded
for EUR, while the second is the number of EUR traded against the respective asset
(“Glossary of Trading Terms,” 2023). CCXT does not deliver both volumes, just the
comparable “Volume from”. Since only this is needed for further use, this does not pose a
problem. As every OHLCV data point comes with a respective timestamp, the first column is
used for this in the format of a Unix timestamp. This is a commonly used format in
development and represents a way to track time as a running total of seconds starting from
the Unix epoch, January 1st, 1970 at 00:00:00 UTC.
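To make the naming convention and the timestamp format more tangible, the following minimal sketch builds a file path and converts between dates and Unix timestamps. The helper names exchange_file_path, to_unix and from_unix and the directory "data" are assumptions made for illustration and are not taken from the attached scripts.

```python
import os
from datetime import datetime, timezone


def exchange_file_path(data_dir, exchange, asset):
    """Build the path <data_dir>/historical_<asset>_EUR/<exchange>_<asset>_h.csv."""
    return os.path.join(data_dir, f"historical_{asset}_EUR", f"{exchange}_{asset}_h.csv")


def to_unix(moment):
    """Seconds since the Unix epoch for a timezone-aware datetime."""
    return int(moment.timestamp())


def from_unix(seconds):
    """Timezone-aware UTC datetime for a Unix timestamp."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


path = exchange_file_path("data", "binance", "ETH")
start = to_unix(datetime(2022, 1, 1, tzinfo=timezone.utc))
print(os.path.basename(path), start, from_unix(start).isoformat())
```

Using timezone-aware datetimes avoids the local time zone of the machine shifting the first and last hour of the observed period.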
Figure 8: Process of gathering data from exchanges for cryptocompare.com and CCXT
Both functions, “historicalData_cryptocompare()” and “historicalData_ccxt()”, for gathering
the required data in OHLCV format from January 1st, 2022 to April 1st, 2023, can be found
as Attachment A and B. The functions take the desired trading pairs as input, as well as start
and end date, API address, exchange name and timeframe (1h). Both are executed in a loop
for each asset and exchange, which are stored as separate arrays. To speed up these mainly
Input/Output (I/O) bound tasks, they are executed in a multithreaded manner. The functions
involve the steps of creating the directory per asset, requesting the API for data, setting up
the csv file with column headers, creating a data frame with the requested data, filtering out
rows which are outside the set timeframe, basic cleaning and writing to the csv file.
[Figure 8 steps: creating a directory per asset; requesting the API for data; setting up the csv file with column headers; creating a dataframe with the requested data; filtering out rows which are outside the set timeframe; cleaning and indexing; writing to the csv file]
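The steps of Figure 8 can be sketched as follows. This is not the code from Attachment A or B: the names gather_one, gather_all and fake_fetch are hypothetical, the API request is replaced by an injected callable so that the sketch runs without network access, and plain csv writing is used instead of a data frame.

```python
import csv
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

COLUMNS = ["time", "open", "high", "low", "close", "volumefrom"]


def gather_one(fetch, data_dir, exchange, asset, start_ts, end_ts):
    """Fetch hourly OHLCV rows for one exchange/asset pair and write them to csv."""
    folder = os.path.join(data_dir, f"historical_{asset}_EUR")
    os.makedirs(folder, exist_ok=True)          # directory per asset
    rows = fetch(exchange, asset)               # request the API (injected callable)
    rows = [r for r in rows if start_ts <= r["time"] <= end_ts]  # keep the timeframe
    rows.sort(key=lambda r: r["time"])          # basic cleaning: order by timestamp
    path = os.path.join(folder, f"{exchange}_{asset}_h.csv")
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS, extrasaction="ignore")
        writer.writeheader()                    # column headers
        writer.writerows(rows)
    return path


def gather_all(fetch, data_dir, exchanges, assets, start_ts, end_ts, workers=8):
    """Run gather_one for every asset/exchange combination in a thread pool."""
    jobs = [(exchange, asset) for asset in assets for exchange in exchanges]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(gather_one, fetch, data_dir, e, a, start_ts, end_ts)
                   for e, a in jobs]
        return [future.result() for future in futures]


def fake_fetch(exchange, asset):
    """Stand-in for an API request, returns four hypothetical candles."""
    return [{"time": t, "open": 1.0, "high": 2.0, "low": 0.5,
             "close": 1.5, "volumefrom": 10.0} for t in (400, 100, 50, 200)]


with tempfile.TemporaryDirectory() as data_dir:
    paths = gather_all(fake_fetch, data_dir, ["binance", "kraken"], ["ETH"], 100, 300)
    with open(paths[0]) as f:
        lines = f.read().splitlines()
print(lines)
```

Threads fit this workload because each job spends most of its time waiting for the network or the disk rather than computing.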
4.4 Data pre-processing
From the top 100 considered assets, 82 were available to trade against EUR. For the first
step of cleaning the data, a manual review is done, checking through every file and their
respective file sizes to know if they are likely to contain all the intended data.
Manual cleaning steps include:
1. Check every file and their respective file sizes to know if they are likely to contain all
the planned data.
2. If only one exchange file is available per crypto asset, the asset folder is
removed, as at least two exchanges are needed to perform two-point arbitrage
trading.
3. For some assets, exchange data was available, but contained throughout just one or
a low number of constant values or just the value 0.0. These exchange files are
therefore unusable and deleted.
4. Those deleted files from the previous step are noted and retried from the second data
source CCXT.
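Steps 2 and 3 of this review could also be supported by a small automated check. The following sketch is an illustration only: the function names and the threshold max_distinct are assumptions and were not part of the manual process described above.

```python
def is_unusable(closes, max_distinct=3):
    """Flag a price series that is empty, all 0.0, or holds only a few constant values."""
    non_zero = [c for c in closes if c != 0.0]
    if not non_zero:
        return True
    return len(closes) > max_distinct and len(set(non_zero)) <= max_distinct


def usable_assets(series_by_asset):
    """Keep only assets with at least two usable exchange series."""
    kept = {}
    for asset, by_exchange in series_by_asset.items():
        good = {ex: s for ex, s in by_exchange.items() if not is_unusable(s)}
        if len(good) >= 2:  # two-point arbitrage needs two exchanges
            kept[asset] = good
    return kept


# hypothetical close prices for illustration only
series_by_asset = {
    "BTC": {
        "binance": [1.0, 2.0, 3.0, 4.0, 5.0],
        "kraken": [1.1, 2.1, 3.1, 4.1, 5.1],
        "broken": [0.0] * 5,
    },
    "ABC": {
        "binance": [1.0, 2.0, 3.0, 4.0, 5.0],
        "kraken": [7.0] * 5,
    },
}
kept = usable_assets(series_by_asset)
print(sorted(kept), sorted(kept["BTC"]))
```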
Of the 100 assets considered, 48 remain after gathering data from cryptocompare.com. They
are available to trade against EUR, have been manually cleaned and are of acceptable data
quality as far as can be judged in this first step. The following table lists them with
abbreviation and full name:
1INCH - 1inch Network
AAVE - Aave
ADA - Cardano
ALGO - Algorand
APE - ApeCoin
APT - Aptos
ARB - Arbitrum
ATOM - Cosmos
AVAX - Avalanche
AXS - Axie Infinity
BAT - Basic Attention Token
BCH - Bitcoin Cash
BTC - Bitcoin
CHZ - Chiliz
CRO - Cronos
CRV - Curve DAO
DASH - Dash
DOGE - Dogecoin
DOT - Polkadot
EGLD - MultiversX
EOS - EOS
ETC - Ethereum Classic
ETH - Ethereum
FIL - Filecoin
FTM - Fantom
GRT - The Graph
ICP - Internet Computer
IMX - Immutable
LINK - Chainlink
LRC - Loopring
LTC - Litecoin
MANA - Decentraland
MASK - Mask Network
MATIC - Polygon
NEAR - NEAR Protocol
SAND - The Sandbox
SHIB - Shiba Inu
SNX - Synthetix
SOL - Solana
TRX - TRON
UNI - Uniswap
USDC - USD Coin
USDT - Tether
VET - VeChain
XLM - Stellar
XRP - XRP
XTZ - Tezos
ZEC - Zcash
Table 6: 48 available assets of the 100 considered, with abbreviation and full name.
5 Crypto asset filtering
5.1 Data cleaning & adjustments
While working with the pre-processed dataset of the previous step, it became clear during the
implementation that the dataset needed to be cleaned further, as the results were extremely
unrealistic. Data quality was therefore once again lower than assumed after the data
collection. First, some exchanges deliver OHLCV data only after a certain timestamp and
only the value 0.0 before that. This can usually be traced back to the fact that the respective
trading pairs were only made available for trading on these exchanges after this timestamp.
However, since the pairs are tradable today, no error is thrown for the period in which this
was not yet the case. Second, some exchanges deliver a fixed value for specific trading pairs
over longer periods of time, even though the volume changes. This value is, however, mostly
within the price range of the other exchanges. For some, the prices are completely outside
the plausible range, i.e. about 100 to 1000 times the average of the other exchanges, and are
therefore considered wrong. These data points do not appear as errors at the beginning of
the analysis, since data is available and is not treated as null values.
For this reason, the incorrect data must be excluded and gathered again via
cryptocompare.com or CCXT. If the new data is still the same, or is not available from the
second source, it is removed from the comparison.
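The three problems described above (leading 0.0 values, frozen prices and implausible price levels) can each be detected with a simple check. The following is a minimal sketch under assumptions: the function names are hypothetical, the median of the exchanges is used as reference instead of the average, and the factor of 100 is taken from the lower bound of the range mentioned above.

```python
from statistics import median


def trim_leading_zeros(prices):
    """Drop the leading run of 0.0 values (trading pair not yet listed)."""
    for i, price in enumerate(prices):
        if price != 0.0:
            return prices[i:]
    return []


def longest_constant_run(prices):
    """Length of the longest run of identical consecutive prices."""
    longest = run = 1 if prices else 0
    for prev, cur in zip(prices, prices[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def outlier_exchanges(prices_by_exchange, factor=100.0):
    """Exchanges whose price is `factor` times above or below the cross-exchange median."""
    mid = median(prices_by_exchange.values())
    return sorted(ex for ex, price in prices_by_exchange.items()
                  if price >= mid * factor or price <= mid / factor)


trimmed = trim_leading_zeros([0.0, 0.0, 1.5, 1.6])
frozen = longest_constant_run([1.0, 1.0, 1.0, 2.0, 2.0])
outliers = outlier_exchanges({"a": 100.0, "b": 101.0, "c": 99.0, "d": 25000.0})
print(trimmed, frozen, outliers)
```

The median is less affected by a single wrong exchange than the mean, which is why it is used as the reference here.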
Exchanges where these problems occur most often, sorted by number of occurrences:
Exchange name | Occurrences
Bitstamp | 15
Bitpanda | 12
Kraken | 9
Binance | 6
Bittrex | 2
Coinbase | 1
Table 7: Exchanges with false data by occurrences
By excluding these problematic exchanges and data points, the crypto asset filtering can
maintain a higher level of reliability and accuracy. Nevertheless, the results calculated from
this rather low-quality data set must be treated with caution. They can, however, indicate a
trend or serve as a benchmark.
5.2 Measurements
In order to assess the price disparities between the assets, the arbitrage index is used, as it
is also applied in other papers that examine arbitrage possibilities. In addition, for assets with
a high arbitrage index, the price differences are calculated relative to the mean price to better
illustrate price discrepancies.
5.2.1 Arbitrage Index
To show the extent of price deviations across exchanges at a specific timeframe, the
arbitrage index is computed, which calculates the maximum price difference between the
exchanges. It gives a measure of the degree of price variation to identify potential arbitrage
opportunities (Duan et al., 2021; Makarov and Schoar, 2020). The initial step involves
calculating the arbitrage index for the given time interval, which for the given data is hourly.
To do this, volume-weighted average price (VWAP) for each hour is determined for every
exchange. The typical price per timeframe is needed first, which is calculated as:
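$$\mathrm{TypicalPrice}_{time} = \frac{\mathrm{High}_{time} + \mathrm{Low}_{time} + \mathrm{Close}_{time}}{3}$$
The VWAP per timeframe is then calculated as the following, where
$\sum_{start}^{time} \mathrm{Volume}_{time}$ is referred to as the cumulated volume since the
start of the observed price period.
$$\mathrm{VWAP}_{time} = \frac{\sum_{start}^{time} \left( \mathrm{TypicalPrice}_{time} \cdot \mathrm{Volume}_{time} \right)}{\sum_{start}^{time} \mathrm{Volume}_{time}}$$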
Subsequently, the maximum price across all exchanges is taken and divided by the minimum
price. Finally, the arbitrage index is averaged at the daily level to reduce the impact of intra-
day volatility.
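A small worked example may help to follow the calculation. The sketch below uses two hypothetical exchanges with two hourly candles each. The function names and all numbers are made up for illustration, and plain Python lists are used instead of data frames.

```python
def cumulative_vwap(candles):
    """candles: list of (high, low, close, volume); returns the cumulative VWAP per interval."""
    vwaps, pv_sum, v_sum = [], 0.0, 0.0
    for high, low, close, volume in candles:
        typical = (high + low + close) / 3      # typical price of the interval
        pv_sum += typical * volume              # cumulated price times volume
        v_sum += volume                         # cumulated volume
        vwaps.append(pv_sum / v_sum if v_sum else float("nan"))
    return vwaps


def arbitrage_index(vwaps_by_exchange):
    """Maximum VWAP divided by minimum VWAP across exchanges, per interval."""
    series = list(vwaps_by_exchange.values())
    return [max(values) / min(values) for values in zip(*series)]


# hypothetical hourly candles (high, low, close, volume)
vwaps = {
    "exchange_a": cumulative_vwap([(102, 98, 100, 2), (104, 100, 102, 2)]),
    "exchange_b": cumulative_vwap([(103, 99, 101, 1), (108, 104, 106, 1)]),
}
index = arbitrage_index(vwaps)
daily_mean = sum(index) / len(index)
print(vwaps, index, daily_mean)
```

In the first hour the VWAPs are 100 and 101, which gives an arbitrage index of 1.01. An index of exactly 1 would mean that both exchanges quote the same price.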
The function “arbitrageIndex_allCurrencies()” implements the calculation of the arbitrage
index for each asset across its available exchanges. This function is adapted further
depending on the desired output, for example to return the arbitrage index for only one
asset:
import glob
import os

import pandas as pd

# assumption: one folder per asset, named '../data/historical_<asset>_EUR'
currency_folders = glob.glob('../data/historical_*_EUR')

def arbitrageIndex_allCurrencies():
    arbitrage_indices = {}
    # calculate the arbitrage index for each currency
    for currency in currency_folders:
        currency_name = os.path.basename(currency).split("_")[1]
        currency_exchangefiles = glob.glob(os.path.join(currency, '*.csv'))
        exchange_data = {}
        # format data and calculate the vwap per exchange
        for exchangefile in currency_exchangefiles:
            ohlcv_data = pd.read_csv(exchangefile)
            exchange = os.path.basename(exchangefile).split("_")[0]
            # ensure correct datetime format and set as index
            ohlcv_data['time'] = pd.to_datetime(ohlcv_data['time'], unit='s')
            ohlcv_data = ohlcv_data.set_index('time')
            # ensure all value columns are in float format
            value_columns = ['open', 'high', 'low', 'close', 'volumefrom']
            ohlcv_data[value_columns] = ohlcv_data[value_columns].astype(float)
            # exclude rows where all OHLCV values are 0.0, consider as null values
            ohlcv_data = ohlcv_data.drop(ohlcv_data[
                (ohlcv_data['open'].eq(0.0)) & (ohlcv_data['high'].eq(0.0)) &
                (ohlcv_data['low'].eq(0.0)) & (ohlcv_data['close'].eq(0.0)) &
                (ohlcv_data['volumefrom'].eq(0.0))].index)
            # calculate the typical price and then VWAP
            ohlcv_data['typical_price'] = (
                ohlcv_data['high'] + ohlcv_data['low'] + ohlcv_data['close']
            ).div(3)
            # cumulative total of price times volume
            ohlcv_data['price*volume'] = (
                ohlcv_data['typical_price'] * ohlcv_data['volumefrom']
            )
            ohlcv_data['cumulative_price*volume'] = ohlcv_data['price*volume'].cumsum()
            # cumulative total of volume, then calculate VWAP
            ohlcv_data['cumulative_volume'] = ohlcv_data['volumefrom'].cumsum()
            ohlcv_data['vwap'] = (
                ohlcv_data['cumulative_price*volume'] / ohlcv_data['cumulative_volume']
            )
            exchange_data[exchange] = ohlcv_data['vwap'].dropna()
        # if the number of exchanges per currency is greater than 1,
        # the arbitrage index is calculated
        if len(exchange_data) > 1:
            combined_data = pd.concat(exchange_data, axis=1)
            # get the max and min vwap per hour
            max_vwap_per_hour = combined_data.max(axis=1)
            min_vwap_per_hour = combined_data.min(axis=1)
            # calculate the arbitrage index
            arbitrage_ratios = max_vwap_per_hour / min_vwap_per_hour
            # calculate the average arbitrage ratio at the daily level
            arbitrage_ratios = arbitrage_ratios.resample('1D').mean().dropna()
            # add the currency's arbitrage index to the dictionary
            arbitrage_indices[currency_name] = arbitrage_ratios
    return arbitrage_indices
5.2.2 Price differences
To analyse price differences in relation to the mean price for each currency over a given
period of time, the percentage deviation of each data point must be calculated. This helps to
understand the degree of price variation and identify potential arbitrage opportunities. As for
the arbitrage index, the typical price is calculated first. Then the mean price is calculated over
the entire period and the price differences per data point.
The function “relative_priceDifferences_byMean()” implements the calculation of the price
deviations per asset across its available exchanges:
def relative_priceDifferences_byMean(currency):
    average_prices = {}
    currency_exchangefiles = glob.glob(os.path.join(
        f'../data/historical_{currency}_EUR', '*.csv'
    ))
    # format data and calculate average price
    for exchangefile in currency_exchangefiles:
        ohlcv_data = pd.read_csv(exchangefile)
        exchange = os.path.basename(exchangefile).split("_")[0]
        # ensure correct datetime format and set as index
        ohlcv_data['time'] = pd.to_datetime(ohlcv_data['time'], unit='s')
        ohlcv_data = ohlcv_data.set_index('time')
        # ensure all value columns are in float format
        value_columns = ['open', 'high', 'low', 'close', 'volumefrom']
        ohlcv_data[value_columns] = ohlcv_data[value_columns].astype(float)
        # exclude rows where all OHLCV values are 0.0, consider as null values
        ohlcv_data = ohlcv_data.drop(ohlcv_data[
            (ohlcv_data['open'].eq(0.0)) & (ohlcv_data['high'].eq(0.0)) &
            (ohlcv_data['low'].eq(0.0)) & (ohlcv_data['close'].eq(0.0)) &
            (ohlcv_data['volumefrom'].eq(0.0))].index)
        # calculate the typical price and use it as average price of the interval
        ohlcv_data['typical_price'] = (
            ohlcv_data['high'] + ohlcv_data['low'] + ohlcv_data['close']
        ).div(3)
        average_prices[exchange] = ohlcv_data['typical_price']
    average_prices_df = pd.concat(average_prices, axis=1)
    # calculate the mean price across exchanges per timestamp and the
    # difference of each exchange price to this mean (in EUR)
    mean_prices = average_prices_df.mean(axis=1)
    price_differences = average_prices_df.sub(mean_prices, axis=0)
    return price_differences
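The function above returns the differences in EUR. If the deviation is wanted in percent of the mean price, as described in the text, the difference can additionally be divided by the mean. The following sketch shows this on plain lists. The function name percentage_deviation and the prices are assumptions made for illustration, and the mean is taken across exchanges per interval, as in the function above.

```python
def percentage_deviation(prices_by_exchange):
    """Deviation of each exchange price from the cross-exchange mean per interval, in percent."""
    exchanges = list(prices_by_exchange)
    result = {exchange: [] for exchange in exchanges}
    # iterate over the intervals, one price per exchange in each row
    for row in zip(*(prices_by_exchange[exchange] for exchange in exchanges)):
        mean_price = sum(row) / len(row)
        for exchange, price in zip(exchanges, row):
            result[exchange].append((price - mean_price) / mean_price * 100)
    return result


# hypothetical typical prices for two intervals
deviations = percentage_deviation({
    "binance": [99.0, 100.0],
    "kraken": [101.0, 100.0],
})
print(deviations)
```

A relative measure makes assets with very different price levels, such as BTC and SHIB, comparable in one plot.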
6 Arbitrage trading prototype
This chapter describes the aspects involved in the development and implementation of the
arbitrage trading prototype. The goal of this system is to find arbitrage opportunities that
exploit differences in asset prices and to simulate the resulting trades as paper trading.
6.1 Programming Language
Selecting a suitable way of developing the arbitrage trading system is essential. It is already
known that a key factor for success is how quickly a trading system can search for and
transmit information, specifically its speed and latency compared to other traders
(Brogaard et al., 2014; Brogaard and Garriott, 2019; Budish et al., 2015; Carrion, 2013;
Kiuchi, 2022; Levus et al., 2021; O’Hara, 2015). To achieve this, the system must be
developed with an approach that is both convenient and fast. In the scope of this thesis, the
system is implemented using the Python programming language.
6.2 Prototype Architecture
The architecture of the arbitrage prototype, as seen in Figure 9, involves three actors with
their corresponding data streams and an overview of their functions. The first is the trading
person, called the trader, who controls the system. The trader starts and stops the prototype
and provides the desired websocket URLs, API keys and trading pairs. Additionally, the
trader is continuously notified about system updates of interest, which can be all incoming
price updates, found arbitrage opportunities or simulated trades. The function that starts the
system is “asyncio.run(main())”, as seen in Figure 9. The second actor is the arbitrage
prototype, which, in the scope of this thesis, runs locally on a computer. It handles the
business logic by connecting to the exchanges, subscribing to the desired information,
mapping the price streams into the same format, finding arbitrage opportunities including
trading fees and simulating trades. The third actor is formed by the exchanges, which deliver
their price streams over websockets.
[Figure 9 elements: actors Trader, Arbitrage Prototype and Exchange 1 to Exchange n, connected via Websocket; numbered functions: 1. asyncio.run(main()), 2. websocketsConnector(), 3. websocketSubscriber(), 4. mapBest_ask_price(), 5. arbitrageOpportunityFinder(), 6. calculateFees(), 7. simulateTrade()]
Figure 9: Architectural approach with three actors and their corresponding data streams
In this example the arbitrage prototype connects to the websockets of the exchanges
Binance and Kraken. To improve performance and handle the continuous I/O tasks
efficiently, the software runs asynchronously to enable parallel websocket connections, for
which the ‘asyncio’ library is used. As the incoming price streams of the exchanges'
websockets must be analysed for profitable price differences, the price streams are stored in
a temporary ‘price_data’ dictionary. This dictionary is cleared every 300 milliseconds,
because only the most recent arbitrage opportunities should be found and longer time
intervals lead to misleading results. The number of milliseconds was chosen arbitrarily. In
tests within the scope of this thesis, 300 has proven to be a good time window.
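One possible way to keep only recent prices is sketched below. It is a variation of the approach described above and not the prototype's code: instead of clearing the whole dictionary every 300 milliseconds, each entry carries its arrival time and is dropped once it is older than the window. The class name PriceWindow and the injectable clock are assumptions made for illustration.

```python
import time


class PriceWindow:
    """Keeps only the price updates received within the last `window_ms` milliseconds."""

    def __init__(self, window_ms=300, clock=time.monotonic):
        self.window = window_ms / 1000   # window length in seconds
        self.clock = clock               # injectable for testing
        self._data = {}

    def update(self, exchange, price, symbol):
        """Store the latest price of an exchange together with its arrival time."""
        self._data[exchange] = (price, symbol, self.clock())

    def fresh(self):
        """Drop expired entries and return {exchange: (price, symbol)}."""
        now = self.clock()
        self._data = {ex: entry for ex, entry in self._data.items()
                      if now - entry[2] <= self.window}
        return {ex: (price, symbol) for ex, (price, symbol, _) in self._data.items()}


# simulated clock so that the example does not depend on real time
fake_now = [0.0]
window = PriceWindow(clock=lambda: fake_now[0])
window.update("binance", 100.0, "BTC/EUR")
fake_now[0] = 0.2
window.update("kraken", 101.5, "BTC/EUR")
fake_now[0] = 0.4
remaining = window.fresh()
print(remaining)
```

With per-entry timestamps a price that arrived just before a clearing cycle is not discarded immediately, at the cost of a slightly larger data structure.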
Figure 10: File structure of the arbitrage trading prototype
The file structure of the arbitrage trading system consists of a main, a functions, a
dictionaries and a database file, as seen in Figure 10. The “main.py” file represents the core
of the system, from which all functions are invoked in an asynchronous manner. The
“functions.py” file contains all the business logic. The “dicts.py” file contains the dictionaries
with the provided input information for the system. Additionally, the “db.py” file saves the
desired logs of found arbitrage opportunities or simulated trades to a local database.
6.3 Finding Arbitrage Opportunities
In this thesis the chosen type of arbitrage trading is pure arbitrage or also called two-point
arbitrage, as explained before. For calculating the possible profit, calculations must be done
for the arbitrage opportunity and the relating fees.
For each trade $t$, the profit $P_t$ is equal to the difference between the sell price
$Sell_t$ and the buy price $Buy_t$, multiplied by the volume $V_t$ traded. The formula to
calculate the profit is (Pauna, 2018):
$$P_t = V_t \, (Sell_t - Buy_t)$$
Taker fees for trading the quantity $q > 0$ on exchange $i$ can be calculated by the following
formula (Hautsch et al., 2018), where $A_t^i$ (Ask) is the respective asset price and
$\rho^{i}(q) > 0$ is the taker fee:
$$A_t^i(q) = A_t^i \left( 1 + \rho^{i}(q) \right)$$
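The profit formula and the taker fees can be combined into a small worked example. The sketch below is an illustration under assumptions: the function names are hypothetical, the fee on the selling side is deducted from the proceeds (which is not stated explicitly above), and deposit and withdrawal fees are ignored. The 0.26 % corresponds to the Kraken taker fee used later in this chapter, the 0.1 % is a made-up value.

```python
def gross_profit(volume, buy_price, sell_price):
    """Profit of one trade without fees: volume times (sell price minus buy price)."""
    return volume * (sell_price - buy_price)


def net_profit(volume, buy_price, sell_price, buy_fee_pct, sell_fee_pct):
    """Profit after taker fees: the buy gets more expensive, the sell yields less."""
    cost = volume * buy_price * (1 + buy_fee_pct / 100)
    proceeds = volume * sell_price * (1 - sell_fee_pct / 100)
    return proceeds - cost


# hypothetical trade: buy 0.5 BTC at 20000 EUR, sell at 20300 EUR
gross = gross_profit(0.5, 20000.0, 20300.0)
net = net_profit(0.5, 20000.0, 20300.0, buy_fee_pct=0.26, sell_fee_pct=0.1)
print(gross, round(net, 2))
```

In this example the fees reduce the profit from 150 EUR to about 113.85 EUR, which shows why small price differences are often not exploitable.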
6.4 Exchange Connections
To interact programmatically with different exchanges, it is necessary to work with APIs.
Exchanges offer two types of API interfaces to communicate with them, which are
Representational State Transfer (REST) and Websocket. As real-time market data is needed
from several exchanges to identify arbitrage opportunities quickly, the request/response
mechanism of REST would have to be called repeatedly in short, predefined intervals, which
would not be suitable. In addition, the problem of rate limits arises again, which would
represent a limitation or could result in temporary or permanent bans when implementing a
high-frequency arbitrage trading system.
What is needed is a real-time data stream that offers price updates in the intervals available
per exchange. For this use case Websockets are used, which first perform a Websocket
protocol handshake and, if the request to the server was successful, then keep a stream open
over a Transmission Control Protocol (TCP) connection. Websocket addresses also use a
different scheme, which is wss instead of https. The websocket connection URL of the
exchange Kraken, for example, is “wss://ws.kraken.com”. Once a connection has been
established, the next step is to subscribe to the desired channel. In the case of this thesis,
order book data of the respective trading pairs is needed, from which the best ask price is
filtered to execute a market order, as these can be executed immediately on centralized
exchanges. For this, only the top of the order book is needed, for example a depth of 10 price
levels on Kraken, which also results in a smaller amount of incoming data and therefore less
data to compute (“Websocket API | Binance Developers,” 2023; “Which API should I use?,”
2023).
As already mentioned, the prototype gets invoked from the “main.py” file in an asynchronous
manner. This happens for every provided websocket URL in the dictionary and connects to
the different websockets through the “websocketsConnector()” function, which takes the URL
and exchange name as input:
async def websocketsConnector(exchange_name, websocket_url):
    # open the websocket connection and subscribe to the order book channel
    async with websockets.connect(websocket_url) as websocket:
        await websocketSubscriber(websocket, exchange_name)
        while True:
            # wait for the next message and map it to (price, symbol, exchange)
            message = await websocket.recv()
            mapped_data = mapBest_ask_price(json.loads(message), exchange_name)
            if mapped_data is not None:
                price_data[exchange_name] = mapped_data
                arbitrageOpportunityFinder()
            await websocketMessager(mapped_data)
The “websocketsConnector()” invokes the next function “websocketSubscriber()” to
subscribe to the desired channels and receive the corresponding price streams. As the
subscription requests differ in format per exchange, these are sent depending on the passed
exchange name:
async def websocketSubscriber(websocket, exchange_name):
    if exchange_name == "kraken":
        await websocket.send(json.dumps(
            {
                "event": "subscribe",
                "subscription": {
                    "name": "book", "depth": 10
                },
                "pair": trading_pairs["kraken"]
            }))
    elif exchange_name == "binance":
        await websocket.send(json.dumps(
            {
                "method": "SUBSCRIBE",
                "params": [
                    f"{pair.lower()}@depth@100ms" for pair in trading_pairs["binance"]
                ],
                "id": 1
            }))
The received messages are mapped to find the best ask price through the
“mapBest_ask_price()” function and assigned to the temporary “price_data” variable. As
different exchanges use different formats for trading pairs, they are mapped to the
ASSET/EUR format, which is used in most cases. Kraken, for instance, internally abbreviates
Bitcoin as XBT instead of the commonly used BTC acronym. The function returns the best
ask price as a float, the symbol and the exchange name:
def mapBest_ask_price(raw_data, exchange):
    if exchange.lower() == 'binance':
        # depth updates are dicts with the event type in 'e' and the asks in 'a'
        if (isinstance(raw_data, dict) and 'e' in raw_data
                and raw_data['e'] == 'depthUpdate'
                and 'a' in raw_data and len(raw_data['a']) > 0):
            symbol = raw_data['s'].replace('EUR', '/EUR')
            return float(raw_data['a'][0][0]), symbol, exchange
    elif exchange.lower() == 'kraken':
        # book updates are lists with the asks in the second and the pair in the last element
        if (isinstance(raw_data, list) and len(raw_data) >= 2
                and isinstance(raw_data[1], dict) and 'a' in raw_data[1]):
            symbol = raw_data[-1].replace('XBT', 'BTC')
            return float(raw_data[1]['a'][0][0]), symbol, exchange
    else:
        raise ValueError("Unsupported exchange")
    # any other message (e.g. subscription status) carries no ask price
    return None
As seen in the “websocketsConnector()” function, there’s also a logger function, which simply
logs the output in the format “Exchange: Best Ask Price, Symbol”, if needed:
async def websocketMessager(mapped_data):
    if mapped_data is not None:
        price, symbol, exchange = mapped_data
        print(f"{exchange}: {price} , {symbol}")
Finally, the “websocketsConnector()” function invokes the “arbitrageOpportunityFinder()”
function to check whether arbitrage opportunities exist. The system loops through the
temporary "price_data" dictionary, which contains the mapped price streams of the
exchanges, and determines the maximum and minimum price per trading pair in the given
period. The maximum prices are stored in the max_price_Symbol variables and the minimum
prices in the min_price_Symbol variables. The maximum variables are initialised with
negative infinity, ensuring that any price encountered is greater than the initial value, so that
the variable is updated as the loop iterates. The minimum variables are initialised with
positive infinity accordingly. In addition, price, symbol and exchange are stored in the
max/min_price_data_Symbol tuples. Trading fees are also included, as can be seen in the
following function. In this example, an arbitrary threshold is used, looking for price
differences of at least 1%. The higher this percentage is set, the less often arbitrage
opportunities are found, but the more likely it is that they can be converted successfully. If
such a predefined arbitrage opportunity is found, the system prints it to the console with the
corresponding information.
def arbitrageOpportunityFinder():
    if price_data is not None:
        max_price_BTC = max_price_ETH = -float('inf')
        min_price_BTC = min_price_ETH = float('inf')
        max_price_data_BTC = min_price_data_BTC = None
        max_price_data_ETH = min_price_data_ETH = None
        # find the highest and the lowest price per trading pair
        for exchange, (price, symbol, exchange_name) in price_data.items():
            if symbol == 'BTC/EUR':
                if price > max_price_BTC:
                    max_price_BTC = price
                    max_price_data_BTC = (price, symbol, exchange)
                if price < min_price_BTC:
                    min_price_BTC = price
                    min_price_data_BTC = (price, symbol, exchange)
            if symbol == 'ETH/EUR':
                if price > max_price_ETH:
                    max_price_ETH = price
                    max_price_data_ETH = (price, symbol, exchange)
                if price < min_price_ETH:
                    min_price_ETH = price
                    min_price_data_ETH = (price, symbol, exchange)
        # BTC: add the taker fee of the exchange each price comes from
        if max_price_data_BTC is not None and min_price_data_BTC is not None:
            max_price_BTC_wFees = calculateFees(max_price_BTC, max_price_data_BTC[2])
            min_price_BTC_wFees = calculateFees(min_price_BTC, min_price_data_BTC[2])
            price_difference_BTC = ((max_price_BTC_wFees - min_price_BTC_wFees)
                                    / min_price_BTC_wFees)
            if price_difference_BTC > 0.01:
                simulateTrade(price_difference_BTC, max_price_data_BTC, min_price_data_BTC)
                print(f"Price difference greater than 1% ({price_difference_BTC}):")
                print(f"Max price: {max_price_data_BTC}")
                print(f"Min price: {min_price_data_BTC}")
        # ETH: same calculation as for BTC
        if max_price_data_ETH is not None and min_price_data_ETH is not None:
            max_price_ETH_wFees = calculateFees(max_price_ETH, max_price_data_ETH[2])
            min_price_ETH_wFees = calculateFees(min_price_ETH, min_price_data_ETH[2])
            price_difference_ETH = ((max_price_ETH_wFees - min_price_ETH_wFees)
                                    / min_price_ETH_wFees)
            if price_difference_ETH > 0.01:
                simulateTrade(price_difference_ETH, max_price_data_ETH, min_price_data_ETH)
                print(f"Price difference greater than 1% ({price_difference_ETH}):")
                print(f"Max price: {max_price_data_ETH}")
                print(f"Min price: {min_price_data_ETH}")
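The same search can be written without one set of variables per trading pair. The following sketch is only an illustration and not part of the prototype: the function name find_opportunities, the fee dictionary and all prices are assumptions. Unlike the function above, it deducts the taker fee from the selling side instead of adding it.

```python
def find_opportunities(price_data, taker_fee_pct, threshold=0.01):
    """price_data: {exchange: (price, symbol, exchange)}; returns a list of opportunities."""
    by_symbol = {}
    for exchange, (price, symbol, _name) in price_data.items():
        by_symbol.setdefault(symbol, []).append((price, exchange))
    found = []
    for symbol, quotes in by_symbol.items():
        if len(quotes) < 2:          # at least two exchanges are needed
            continue
        low_price, buy_on = min(quotes)
        high_price, sell_on = max(quotes)
        buy_cost = low_price * (1 + taker_fee_pct[buy_on] / 100)
        sell_proceeds = high_price * (1 - taker_fee_pct[sell_on] / 100)
        spread = (sell_proceeds - buy_cost) / buy_cost
        if spread > threshold:
            found.append({"symbol": symbol, "buy_on": buy_on,
                          "sell_on": sell_on, "spread": spread})
    return found


# hypothetical prices and fees for illustration only
prices = {
    "binance": (20000.0, "BTC/EUR", "binance"),
    "kraken": (20400.0, "BTC/EUR", "kraken"),
    "bitstamp": (1500.0, "ETH/EUR", "bitstamp"),
}
fees = {"binance": 0.1, "kraken": 0.26, "bitstamp": 0.4}
opportunities = find_opportunities(prices, fees)
print(opportunities)
```

Grouping by symbol means that adding a further trading pair only requires subscribing to it, without changing the search logic.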
In the “arbitrageOpportunityFinder()” function, also the trading fees are calculated using the
function "calculateFees()", at this step still without deposit and withdrawal fees:
def calculateFees(price, exchange):
    return price + (price * (exchange_fees[exchange]['Taker Fee'][0]['percentage'] / 100))
If a predefined arbitrage opportunity is found, the next step is to simulate the trade as paper
trading in the function simulateTrade(), which logs it to a csv file, where it can then be
analysed and evaluated again:
import time

def simulateTrade(price_difference, max_price_data, min_price_data):
    # append the trade to a csv file: timestamp, price difference in percent,
    # then price, symbol and exchange of the maximum and of the minimum price
    with open('./trades.csv', 'a') as f:
        f.write(
            f"{time.time()},{price_difference*100},"
            f"{max_price_data[0]},{max_price_data[1]},{max_price_data[2]},"
            f"{min_price_data[0]},{min_price_data[1]},{min_price_data[2]}\n"
        )
6.5 Dictionaries
To encapsulate all static data structures and to improve flexibility, performance and code
reusability, a separate Python file with multiple dictionaries is used. The first is
“websocket_urls”, where all public websocket addresses are stored by exchange name. The
second is “trading_pairs”, where the trading pair formats used are stored by exchange name,
as they are not consistent across exchanges. The third dictionary is “exchange_fees”, where
all taker fees as well as EUR deposit and withdrawal fees are stored by exchange name.
These are the dictionaries used for the exchange Kraken:
websocket_urls = {
    "kraken": "wss://ws.kraken.com"
}
trading_pairs = {
    "kraken": ["XBT/EUR", "ETH/EUR"]
}
exchange_fees = {
    'kraken': {
        'Taker Fee': [
            {
                '30day-minAmount': 0,
                '30day-maxAmount': 50000,
                'percentage': 0.26,
                'absolute': None,
            },
            {
                '30day-minAmount': 50001,
                '30day-maxAmount': 100000,
                'percentage': 0.24,
                'absolute': None,
            }
        ],
        'Deposit EUR': {
            'Credit Card': {
                'allowed': True,
                'percentage': 3.75,
                'additional-fixed': 0.25,
                'absolute': None,
            },
            'SEPA': {
                'percentage': None,
                'absolute': 1,
            }
        },
        'Withdraw EUR': {
            'Credit Card': {
                'allowed': False,
                'percentage': None,
                'additional-fixed': None,
                'absolute': None,
            },
            'SEPA': {
                'percentage': None,
                'absolute': 1,
            }
        },
    }
}
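The tiered structure of the taker fees allows the fee to be selected by the trading volume of the last 30 days. The prototype itself only uses the first tier. The following sketch shows how a lookup could work. The function names and the example volumes are assumptions, and only the two tiers listed above are included.

```python
def taker_fee_percentage(fee_tiers, volume_30d):
    """Return the taker fee in percent for a given 30-day trading volume."""
    for tier in sorted(fee_tiers, key=lambda t: t["30day-minAmount"]):
        if volume_30d <= tier["30day-maxAmount"]:
            return tier["percentage"]
    raise ValueError("no fee tier defined for this volume")


def price_with_taker_fee(price, fee_tiers, volume_30d=0):
    """Price including the taker fee of the matching tier."""
    return price * (1 + taker_fee_percentage(fee_tiers, volume_30d) / 100)


# the two Kraken tiers from the dictionary above
kraken_tiers = [
    {"30day-minAmount": 0, "30day-maxAmount": 50000, "percentage": 0.26, "absolute": None},
    {"30day-minAmount": 50001, "30day-maxAmount": 100000, "percentage": 0.24, "absolute": None},
]
low_volume_fee = taker_fee_percentage(kraken_tiers, 10000)
high_volume_fee = taker_fee_percentage(kraken_tiers, 75000)
print(low_volume_fee, high_volume_fee, price_with_taker_fee(20000.0, kraken_tiers))
```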
6.6 Testing
Since automated trading systems involve real money, testing their function is of particular
importance. In the scope of this thesis, a distinction can be made between detecting
arbitrage opportunities, which is the main focus, and simulating trades. The detection step
can be tested in detail, as it does not yet require executing real trades. The testing type used
for the trading system in this thesis is referred to as paper trading, where the last step, the
execution, is only written down ("on paper") and evaluated for profitability later. For the stage
of placing trades, only speed and latency are the limiting factors when converting a detected
arbitrage opportunity into a successful arbitrage trade.
7 Results
7.1 Data collection & management
Of the 100 considered assets in the beginning, 48 remain from gathering data from
cryptocompare.com. Those are available to trade against EUR, cleaned and are of
acceptable data quality as far as can be judged in the first method.
This table shows the 48 assets with abbreviation and full name, which were considered for
further use:
1INCH - 1inch Network
AAVE - Aave
ADA - Cardano
ALGO - Algorand
APE - ApeCoin
APT - Aptos
ARB - Arbitrum
ATOM - Cosmos
AVAX - Avalanche
AXS - Axie Infinity
BAT - Basic Attention Token
BCH - Bitcoin Cash
BTC - Bitcoin
CHZ - Chiliz
CRO - Cronos
CRV - Curve DAO
DASH - Dash
DOGE - Dogecoin
DOT - Polkadot
EGLD - MultiversX
EOS - EOS
ETC - Ethereum Classic
ETH - Ethereum
FIL - Filecoin
FTM - Fantom
GRT - The Graph
ICP - Internet Computer
IMX - Immutable
LINK - Chainlink
LRC - Loopring
LTC - Litecoin
MANA - Decentraland
MASK - Mask Network
MATIC - Polygon
NEAR - NEAR Protocol
SAND - The Sandbox
SHIB - Shiba Inu
SNX - Synthetix
SOL - Solana
TRX - TRON
UNI - Uniswap
USDC - USD Coin
USDT - Tether
VET - VeChain
XLM - Stellar
XRP - XRP
XTZ - Tezos
ZEC - Zcash
Table 8: The 48 assets, for which data was available, with abbreviation and full name
Of the 85 available exchanges, 16 remain. Most of them were sorted out in the process
because they do not allow trading against Euro and therefore send no data. Others were
excluded during the data cleaning of this method, based on the criteria mentioned earlier.
The following table shows the 16 remaining exchanges, with abbreviation and full name, that
were considered for further use:
binance - Binance
bitfinex - Bitfinex
bitflyer - Bitflyer
bitpanda - Bitpanda
bitstamp - Bitstamp
bittrex - Bittrex
blockchaincom - Blockchain.com
cexio - CEX.IO
Coinbase - Coinbase
coinfield - CoinField
currency - Currency.com
exmo - Exmo
gemini - Gemini
kraken - Kraken
lmax - LMAX Digital
paymium - Paymium
Table 9: The 16 remaining exchanges from which data was available, with abbreviation and full name.
To show a data example, the first 24 hours of the desired time window from January 1st,
2022 to April 1st, 2023 are shown here. This sample is from the exchange Binance for the
trading pair ETH/EUR, downloaded via the cryptocompare.com API:
Figure 11: Data example from Binance for ETH/EUR over the specified timeframe.
7.2 Crypto asset filtering
Two measures were chosen to filter the 48 considered crypto assets for arbitrage
opportunities over the last 16 months. First, the arbitrage index was calculated for all assets
across their available exchanges, and then for groups of 10 assets each, selected by specific
factors. Based on those results, the price differences were computed and visualized for the
top 10 assets with the highest mean arbitrage index.
7.2.1 Results for Arbitrage Index
By calculating the arbitrage index for all assets over their respective available exchanges,
the results are as shown in the following graph. It can already be seen that arbitrage
opportunities tend to exist, since in an efficient market, where no price discrepancies can be
found between markets, the arbitrage index should always equal 1.
Figure 12: Arbitrage Index for all 48 assets over all their available exchanges
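As an illustration of the measure plotted above, the following minimal sketch computes an
arbitrage index per time step. It assumes that the index is the ratio of the highest to the
lowest price quoted across exchanges at the same time, which is consistent with the value of
1 expected in an efficient market. The function name and the sample prices are hypothetical
and are not taken from the thesis data.

```python
from typing import Dict, List


def arbitrage_index(prices_by_exchange: Dict[str, List[float]]) -> List[float]:
    """Return the max/min price ratio across exchanges for each time step.

    Assumption: index = highest price / lowest price at the same timestamp,
    so a value of 1.0 means no price discrepancy between exchanges.
    """
    series = list(prices_by_exchange.values())
    if len(series) < 2:
        raise ValueError("at least two exchanges are required")
    length = len(series[0])
    if any(len(s) != length for s in series):
        raise ValueError("all price series must have the same length")
    # zip(*series) yields one tuple of prices (one per exchange) per time step.
    return [max(step) / min(step) for step in zip(*series)]


# Hypothetical daily prices in EUR for one asset on three exchanges.
sample = {
    "binance": [100.0, 102.0, 101.0],
    "kraken": [100.0, 100.0, 104.03],
    "bitstamp": [100.0, 101.0, 101.0],
}
index = arbitrage_index(sample)
```

With these sample values the index is exactly 1 on the first day, where all three prices
agree, and rises above 1 as soon as the exchanges diverge.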
In the next figure, Figure 13, only the top ten assets by market cap are plotted to gain better
insights. These include Bitcoin (BTC), Ethereum (ETH), Tether (USDT), USD Coin (USDC),
Ripple (XRP), Cardano (ADA), Dogecoin (DOGE), Polygon (MATIC) and Solana (SOL). Solana
has the highest arbitrage index of those, which makes it the most likely of this selection to
show arbitrage opportunities across exchanges. In contrast, USDT is a "stablecoin" that is
linked to the US dollar and, as the name implies, should be stable. Nevertheless, a significant
price jump can be observed around February/March 2022, which is most likely due to errors
in the data. From the following statistical table in Figure 14, one can conclude that SOL has
the highest maximum value, standard deviation and mean value. It can also be seen that the
two stablecoins in the list, USDC and USDT, have the lowest mean values.
Figure 13: Arbitrage Index for the top 10 assets (by market cap) over all their available exchanges
Figure 14: Corresponding table with statistical insights about the arbitrage Index for the top 10 assets
(by market cap)
For comparison, the next figure, Figure 15, plots only the ten assets with the lowest market
cap from the considered list. These are 1inch Network (1INCH), Dash (DASH), Zcash (ZEC),
Synthetix (SNX), Curve DAO Token (CRV), Chiliz (CHZ), Immutable (IMX), Axie Infinity
(AXS), Tezos (XTZ) and Aave (AAVE). In general, a higher volatility and higher index values
can be seen here. While the majority are considerably steady in price, 1INCH, XTZ, AAVE
and AXS stand out with a substantially higher arbitrage index. According to the following
statistical table in Figure 16, XTZ has the highest maximum value, standard deviation and
mean value, while DASH has the lowest of those values.
Figure 15: Arbitrage Index for the 10 assets with the lowest market cap of the considered list, over all
their available exchanges
Figure 16: Corresponding table with statistical insights about the arbitrage Index for the lowest 10
assets of the considered list (by market cap)
In the following Figure 17, the assets with the highest mean arbitrage index are displayed.
These are Decentraland (MANA), Cosmos (ATOM), Loopring (LRC), Solana (SOL), Tezos
(XTZ), NEAR Protocol (NEAR), 1inch Network (1INCH), Algorand (ALGO), Shiba Inu (SHIB)
and Polkadot (DOT). As the assets with the highest mean, they can be considered the most
likely to generate arbitrage opportunities. From the statistical table in Figure 18, it can be
concluded that MANA is the asset with the highest mean and standard deviation, and DOT
the one with the lowest values.
Figure 17: Arbitrage Index for the 10 assets with the highest mean, over all their available exchanges
Figure 18: Corresponding table with statistical insights about the arbitrage Index for the 10 assets with
the highest mean
In the following Figure 19, the assets with the lowest mean arbitrage index are displayed.
These are Ethereum Classic (ETC), USD Coin (USDC), Arbitrum (ARB), Zcash (ZEC), TRON
(TRX), Mask Network (MASK), Dash (DASH), Cronos (CRO), Synthetix (SNX) and Filecoin
(FIL). From this, it can be concluded that these are the assets least likely to offer arbitrage
opportunities. From the statistical table in Figure 20, it can be concluded that ETC is the
asset with the lowest mean and CRO the one with the lowest standard deviation. It is
noteworthy in this graph that CRO has a mean close to zero until November 2022, but has
been rising strongly and continuously since then.
Figure 19: Arbitrage Index for the 10 assets with the lowest mean, over all their available exchanges
Figure 20: Corresponding table with statistical insights about the arbitrage Index for the 10 assets with
the lowest mean
To give a better picture of all 48 assets, the entire table is shown below in Figure 21. It
contains the abbreviations as row labels and the mean value, standard deviation, minimum
and maximum value in the columns. The table is sorted in ascending order by the mean
value over the considered timeframe, the last 16 months.
Figure 21: All 48 assets, sorted by the arbitrage index mean value of the last 16 months.
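The statistical tables in this section could be derived from the index series roughly as
sketched below. This is only an illustration using the Python standard library; the asset
labels AAA, BBB and CCC and their values are placeholders and do not correspond to any
asset or result of this thesis.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple

Row = Tuple[str, float, float, float, float]  # asset, mean, std, min, max


def summarize(index_by_asset: Dict[str, List[float]]) -> List[Row]:
    """Build one row of statistics per asset, sorted by mean in ascending order."""
    rows = [
        (asset, mean(values), stdev(values), min(values), max(values))
        for asset, values in index_by_asset.items()
    ]
    rows.sort(key=lambda row: row[1])
    return rows


# Placeholder arbitrage index values per asset (not thesis data).
demo = {
    "AAA": [1.00, 1.02, 1.04],
    "BBB": [1.00, 1.00, 1.03],
    "CCC": [1.10, 1.20, 1.30],
}
table = summarize(demo)
lowest_mean_first = [row[0] for row in table]
# The assets with the highest mean are the last rows, listed here highest first.
top_two_by_mean = [row[0] for row in table[-2:][::-1]]
```

Selecting the first or last ten rows of such a table would give the groups with the lowest and
highest mean that are discussed above.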
7.2.2 Results for Price Differences
Based on the results of the arbitrage index calculations, the price differences per asset were
computed over the last 16 months across all respective available exchanges. Considered
here are the top 10 crypto currencies with the highest mean value of the arbitrage index. For
some of these, price differences occur at high frequency; for others they occur at low
frequency and are therefore clearly visible on a plot. In the following, the results are shown
for five assets for which the price differences are easily visible in the plot as well as in the
data.
In addition to the plots, a corresponding table with the highest price differences is shown per
asset. It includes the lower-priced exchange, the higher-priced exchange and the maximum
price difference. For better comparison, the time component is also included, with the "Time"
of the price difference as well as the "Start Time", "End Time" and duration of the time
window of the arbitrage opportunity. In addition, to allow a better assessment of the price
difference, the mean price over all exchanges at that time is given, together with the
percentage price difference relative to it.
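The columns described above can be illustrated with a short sketch. It is a simplified
reading of the description and not the original implementation: the helper names, the 10%
threshold used for the duration and all sample values are assumptions chosen for the
example.

```python
from statistics import mean
from typing import Dict, Iterable, NamedTuple


class PriceDifference(NamedTuple):
    low_exchange: str
    high_exchange: str
    difference: float    # absolute difference in the quote currency
    mean_price: float    # mean over all exchanges at that time
    relative_pct: float  # difference relative to the mean price, in percent


def price_difference(prices: Dict[str, float]) -> PriceDifference:
    """Compare the cheapest and the most expensive exchange at one point in time."""
    low = min(prices, key=prices.get)
    high = max(prices, key=prices.get)
    diff = prices[high] - prices[low]
    avg = mean(list(prices.values()))
    return PriceDifference(low, high, diff, avg, diff / avg * 100.0)


def longest_run(flags: Iterable[bool]) -> int:
    """Length of the longest run of consecutive True values, e.g. in days."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


# Hypothetical prices of one asset at one point in time.
row = price_difference({"exmo": 0.55, "kraken": 0.50, "bitstamp": 0.45})

# Hypothetical daily relative differences; duration above an assumed 10% threshold.
daily_pct = [2.0, 12.0, 15.0, 11.0, 3.0]
duration_days = longest_run(p >= 10.0 for p in daily_pct)
```

In this example the difference of 0.10 between the highest and the lowest price corresponds
to about 20% of the mean price of 0.50, and the difference stays above the threshold for
three consecutive days.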
Figure 22: Relative Price Differences to the mean price for 1INCH
Figure 23: Price Differences to the mean price table for 1INCH
Four exchanges are available for 1INCH, which show clear price differences. The exchanges
with the lowest and the highest price are balanced and change regularly. The highest relative
price difference here was around 39%, and the two highest differences lasted for 12 days.
Figure 24: Relative Price Differences to the mean price for ALGO
Figure 25: Price Differences to the mean price table for ALGO
Four exchanges are available for ALGO, which also show clear price differences. These are
largest at the beginning of the period and become increasingly smaller over time. Looking at
the lowest and highest price of the exchanges, one can see that Exmo has the highest prices
and is rarely among the lowest. The highest relative price difference was around 11%, but all
arbitrage opportunities disappeared again within one day.
Figure 26: Relative Price Differences to the mean price for ATOM
Figure 27: Price Differences to the mean price table for ATOM
Six exchanges are available for ATOM, which also show clear price differences that become
smaller towards the end of the period. The exchanges with the lowest and the highest prices
are balanced. The highest relative price difference here was around 35%. Most of the
arbitrage opportunities disappeared within one day, but one of about 15% persisted over six
days.
Figure 28: Relative Price Differences to the mean price for MANA
Figure 29: Price Differences to the mean price table for MANA
Four exchanges are available for MANA. The price differences were largest in April/May
2022, somewhat smaller in July to August 2022 and smaller again in November 2022. From
January to April 2023, MANA again experienced a phase of larger price differences. The
exchanges are balanced at the lowest and highest price. The highest relative price difference
here was around 18%. Most arbitrage opportunities disappeared within a day, but some
remained for one or even three days.
Figure 30: Relative Price Differences to the mean price for XTZ
Figure 31: Price Differences to the mean price table for XTZ
Four exchanges are also available for XTZ. Price differences were generally smaller here,
with a few large outliers. Looking at the lowest and highest price of the exchanges, one can
see that Cex.io has the highest prices. The highest relative price difference was as much as
60%. These strong but short-lived price differences remained for two days or disappeared
again within one day.
7.3 Arbitrage trading prototype
The functionality and approach of the arbitrage trading prototype were already discussed in
detail. Here, the results of the prototype are evaluated. Once started, the prototype runs until
it is explicitly stopped. The arbitrage opportunities found are logged to the console, and the
trade is simulated with market orders to test the viability of the identified openings: the price
difference, the minimum and maximum price, and the respective symbol and exchange
names are written to a csv file. In addition, a Unix timestamp is added.
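The paper trading step described above could look roughly like the following sketch. The
column names, the function name and the sample values are hypothetical and do not
reproduce the actual csv layout of the prototype. Expressing the difference relative to the
lower price is also an assumption.

```python
import csv
import io
import time
from typing import Optional, TextIO

FIELDS = ["timestamp", "symbol", "min_exchange", "min_price",
          "max_exchange", "max_price", "difference_pct"]


def log_paper_trade(out: TextIO, symbol: str,
                    min_exchange: str, min_price: float,
                    max_exchange: str, max_price: float,
                    timestamp: Optional[int] = None,
                    write_header: bool = False) -> dict:
    """Write one simulated trade to a csv stream instead of placing an order."""
    row = {
        # Unix timestamp in seconds; the current time is used if none is given.
        "timestamp": int(time.time()) if timestamp is None else timestamp,
        "symbol": symbol,
        "min_exchange": min_exchange,
        "min_price": min_price,
        "max_exchange": max_exchange,
        "max_price": max_price,
        # Assumption: difference expressed relative to the lower price.
        "difference_pct": round((max_price - min_price) / min_price * 100.0, 4),
    }
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerow(row)
    return row


# In-memory demonstration with made-up prices and a fixed timestamp.
buffer = io.StringIO()
row = log_paper_trade(buffer, "ETH/EUR", "kraken", 1000.0, "binance", 1020.0,
                      timestamp=1683700000, write_header=True)
lines = buffer.getvalue().splitlines()
```

For a persistent log, the same function could be called with a file opened in append mode,
for example open("paper_trades.csv", "a", newline=""), where the file name is again only a
placeholder.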
For this test, the arbitrage prototype was run for Binance and Kraken with the trading pairs
BTC/EUR and ETH/EUR for eight hours on May 10th, with a predefined minimum relative
price difference of 1%. The following Figure 32 shows a snapshot of the first 20 arbitrage
opportunities found:
Figure 32: Example of the prototype logging to csv, where arbitrage opportunities were found.
In this example, the highest arbitrage opportunity found was 9.9% for ETH/EUR at 15:43
CEST, where Binance had the higher price. Trading fees are already included, as they are
applied before the relative price difference is compared against the 1% threshold. In those
eight hours, Binance was always the higher-priced and Kraken the lower-priced exchange for
Bitcoin and Ethereum, and opportunities were found for both assets. It must be noted that
when an arbitrage opportunity is used, i.e. when a trade is also executed, the opportunities
that follow immediately afterwards are not taken into account for a short period of time. This
results from the necessary recalculation of the available assets on the exchanges in order to
be able to sell them on the higher-priced ones, which would be required in an autonomous
arbitrage trading system on the real market. Thus, directly consecutive arbitrage
opportunities in this example cannot each be counted as a separate opportunity. The number
of arbitrage opportunities found is consequently only exemplary.
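The two mechanisms described in this section, the fee-adjusted threshold and the short
pause after a used opportunity, can be sketched as follows. The fee rates, the pause length
and the function names are assumptions made for illustration and are not the values or
identifiers used in the prototype.

```python
from typing import Iterable, List


def net_difference_pct(buy_price: float, sell_price: float,
                       buy_fee: float, sell_fee: float) -> float:
    """Relative price difference in percent after applying trading fees."""
    cost = buy_price * (1.0 + buy_fee)        # paid on the lower-priced exchange
    proceeds = sell_price * (1.0 - sell_fee)  # received on the higher-priced one
    return (proceeds - cost) / cost * 100.0


def is_opportunity(buy_price: float, sell_price: float,
                   buy_fee: float, sell_fee: float,
                   min_pct: float = 1.0) -> bool:
    """True if the fee-adjusted difference reaches the minimum percentage."""
    return net_difference_pct(buy_price, sell_price, buy_fee, sell_fee) >= min_pct


def apply_cooldown(timestamps: Iterable[float], cooldown_seconds: float) -> List[float]:
    """Keep an opportunity only if enough time has passed since the last one used."""
    accepted: List[float] = []
    last = None
    for ts in timestamps:
        if last is None or ts - last >= cooldown_seconds:
            accepted.append(ts)
            last = ts
    return accepted


# Made-up prices with assumed fee rates of 0.1% (buy side) and 0.2% (sell side).
wide_gap = is_opportunity(100.0, 102.0, 0.001, 0.002)
narrow_gap = is_opportunity(100.0, 101.0, 0.001, 0.002)
# Opportunities detected at these seconds, with an assumed pause of 5 seconds.
used = apply_cooldown([0, 1, 2, 5, 6, 10], 5)
```

With these numbers a gross gap of 2% remains above the 1% threshold after fees, while a
gross gap of 1% does not, and only three of the six consecutive detections would be counted.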
8 Discussion
8.1 Interpretation of results
Results shown in this thesis for occurring arbitrage opportunities were notable and in line
with other papers (Brauneis and Mestel, 2018; Duan et al., 2021; Makarov and Schoar,
2020). Within the scope of this thesis, it was shown that arbitrage possibilities exist, based on
data of the last 16 months, from January 1st, 2022 to April 1st, 2023. Most of the time, these
disappear again within a day, but it was also possible to show that they can last a few or
even up to 12 days. Furthermore, it was shown that relative price differences in relation to
the mean price usually amount to a maximum of around 30%, but in exceptional cases can
even reach a difference of up to 60%.
To return to the question of how trustworthy the results of the crypto asset filtering method
are, it must be mentioned again that the quality of the data from the free sources considered
turned out to be rather low. In addition, the OHLCV data were only available in hourly
intervals. This means that the open, high, low and close prices are taken per hour, although
considerable price changes can already occur within one hour. To ease this limitation
somewhat, the mean price of the day was used to reduce intraday volatility. In addition, it is
extremely unlikely that these recorded prices occurred at the same moment, e.g. within the
fraction of a second in which a trading decision is made. For this reason, the results cannot
be taken at face value, as prices on the crypto market can change quickly. However, they
can certainly be seen as a trend, since arbitrage possibilities could also be demonstrated
over longer periods of time in this thesis.
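The aggregation from hourly OHLCV candles to one price per day can be sketched as
follows. The sketch assumes the common definition of the typical price as the average of
high, low and close, which may differ from the exact calculation used in this thesis. The
field names and the candle values are invented for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean
from typing import Dict, List


def typical_price(high: float, low: float, close: float) -> float:
    """Assumed definition: average of the high, low and close price of a candle."""
    return (high + low + close) / 3.0


def daily_mean_price(candles: List[dict]) -> Dict:
    """Average the hourly typical prices per UTC calendar day."""
    by_day = defaultdict(list)
    for candle in candles:
        day = datetime.fromtimestamp(candle["time"], tz=timezone.utc).date()
        by_day[day].append(
            typical_price(candle["high"], candle["low"], candle["close"])
        )
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Invented hourly candles: two on January 1st, 2022 (UTC) and one on the next day.
candles = [
    {"time": 1640995200, "high": 12.0, "low": 9.0, "close": 9.0},
    {"time": 1640998800, "high": 22.0, "low": 18.0, "close": 20.0},
    {"time": 1641081600, "high": 30.0, "low": 30.0, "close": 30.0},
]
daily = daily_mean_price(candles)
days = [day.isoformat() for day in daily]
values = list(daily.values())
```

The example also shows the limitation discussed above: the two hourly prices of 10 and 20
on the first day are merged into a single value of 15, so any intraday discrepancy between
exchanges is no longer visible.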
For the arbitrage prototype, it could be shown, consistent with the points above, that these
opportunities can be found. They were detected in a test example for Bitcoin and Ethereum
on the exchanges Binance and Kraken over a period of 8 hours. The relative price
differences reached up to 9.9%. Subsequent trades with market orders were simulated by
logging them to a csv file. To execute these trades successfully, other factors come into
play, such as the execution and latency times of the exchanges and the assets available for
sale on the various exchanges. As these arbitrage possibilities existed for a short but
sufficient period of time, it can be assumed that they can be used profitably. Testing the
simulation with trade execution and a starting capital is part of the future work of this thesis,
among other activities.
In the following, the research questions are answered, starting with the main question of how
trading strategies of financial markets, such as arbitrage trading, can also be applied to
assets on crypto markets in an automated way. In this thesis it could be shown that arbitrage
trading can also be used for assets on the crypto market: first, because existing price
differences were found in the crypto asset filtering method as well as subsequently in the
prototype; second, through a system, written in a programming language suitable for the use
case, that finds arbitrage opportunities and executes simulated trades in an automated way.
It can thus be said that arbitrage trading can also be applied to the crypto market and has
turned out to be well suited to it because of the characteristics of the market.
RQ 1.1 posed the question of which information must be gathered to enable decision making
for arbitrage trading. For that, a dataset is needed that includes the prices at which trades
have taken place per time interval. Historical data in OHLCV (Open, High, Low, Close,
Volume) format, available per asset and per exchange, is suitable. For the accuracy of the
calculations, the smallest possible time interval is recommended. In addition, information
about trading fees, transfer times and regulatory/legal conditions must be gathered for the
exchanges or assets, and a certain market liquidity is needed to enable trading.
RQ 1.2 raised the question of which requirements and criteria exist for a crypto asset to be
considered for arbitrage trading. To trade using the two-point arbitrage method, the asset
must be listed on at least two exchanges, which should be available in the trader’s region
and offer a certain level of security and reliability. The asset must also have sufficient
liquidity to be traded, which is not relevant for centralised exchanges, as it would otherwise
not be offered there. In addition, a crypto currency should have low transaction fees to
mitigate the possibility of these absorbing the profits, which again is not as relevant on CEX
as on DEX. If these criteria are met, the asset should be tested to see if it is likely to offer
arbitrage opportunities. Crypto currencies like USDT or USDC, which are stablecoins, can be
excluded, as the name implies.
RQ 1.3 asked how an arbitrage trading strategy in the crypto market can be realized as a
software prototype. For this, a performant programming language suitable for the use case
must be chosen, which can be extended with systems for specific purposes. At least three
actors are required: the trader, the arbitrage trading prototype and the exchanges to be
connected. These must be able to communicate with each other, and the prototype must
connect to the exchanges via websockets to receive real-time data. Subsequently, the
system needs to be extended with business logic: subscribing to the desired information
from the exchanges, mapping the price streams into the same format, finding arbitrage
opportunities including trading fees, and executing trades. The software must then be
continuously tested, improved and extended with functions that enable fully autonomous
arbitrage trading. As shown in this thesis, when implemented this way, an arbitrage trading
prototype can find opportunities and simulate trades based on them. To implement this
strategy fully and autonomously, further work is needed, such as real trade execution or a
balancing strategy for assets between the exchanges, as described in the chapter Outlook
and Future Work.
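The step of mapping the price streams into the same format can be illustrated with a small
sketch. The two message layouts below are invented and do not correspond to the actual
websocket payloads of Binance, Kraken or any other exchange. The names Tick,
map_exchange_a and map_exchange_b are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Tick:
    exchange: str
    symbol: str      # unified form, e.g. "BTC/EUR"
    price: float
    timestamp: float  # Unix time in seconds


def map_exchange_a(msg: dict) -> Tick:
    # Invented payload: {"s": "BTCEUR", "p": "25000.5", "T": 1683700000000}
    # Assumption: the quote currency is always the last three characters.
    base, quote = msg["s"][:-3], msg["s"][-3:]
    return Tick("exchange_a", f"{base}/{quote}", float(msg["p"]), msg["T"] / 1000.0)


def map_exchange_b(msg: dict) -> Tick:
    # Invented payload: {"pair": "BTC-EUR", "last": 25010.0, "ts": 1683700000.2}
    return Tick("exchange_b", msg["pair"].replace("-", "/"),
                float(msg["last"]), float(msg["ts"]))


# Latest tick per (exchange, symbol), kept in a plain dictionary.
latest: Dict[Tuple[str, str], Tick] = {}


def update(tick: Tick) -> None:
    latest[(tick.exchange, tick.symbol)] = tick


def best_prices(symbol: str) -> Tuple[Tick, Tick]:
    """Return the lowest-priced and highest-priced tick for a unified symbol."""
    ticks = [t for t in latest.values() if t.symbol == symbol]
    return min(ticks, key=lambda t: t.price), max(ticks, key=lambda t: t.price)


update(map_exchange_a({"s": "BTCEUR", "p": "25000.5", "T": 1683700000000}))
update(map_exchange_b({"pair": "BTC-EUR", "last": 25010.0, "ts": 1683700000.2}))
low, high = best_prices("BTC/EUR")
```

Once every exchange is mapped to the same record type, the comparison logic no longer
needs to know which exchange a price came from.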
8.2 Limitations
Two limitations have emerged in the course of this work. The first is testing: since this is an
academic thesis, the possibilities for tests with real money on the crypto market are limited.
For this reason, paper trading was used, where the last step of the trade execution was
logged to a csv file with all details that would otherwise have represented a trade. As
centralized exchanges offer the possibility of market orders and these are used in the
prototype to ensure that a trade is executed immediately at the current market price, these
values can still be considered as realistic. One factor that cannot be considered, however, is
execution time, i.e. the time it takes to actually execute a trade, since a minimum amount of
time elapses during data processing, recognition of the arbitrage opportunity and the
decision to place a trade. Confirming the results of this thesis would therefore require not
only a substantial starting capital but also a longer test phase or frequent sample tests over a
longer period of time, as arbitrage opportunities vary over time.
The second limitation is the previously mentioned data quality. Since free sources of crypto
asset prices from various exchanges were used in this work, it became apparent that they
are of significantly lower quality. This was noticeable in comparison to the data quality of
other papers that used paid data sources, which can cost up to several thousand euros. In
addition, it was not possible to obtain OHLCV data per minute via free resources, only per
hour, which makes prices difficult to evaluate. From the four known prices per hour, the
typical price was used, but due to the rapidly changing prices in the crypto market, it is very
unlikely that the calculated prices occurred at the same moment. Since arbitrage
opportunities only occur when assets are mispriced at the same time, an average price per
hour is not a meaningful indicator. For this reason, the results of the second method are to
be considered as a trend, but not as a basis for any monetary decisions.
With financial resources, these limitations can be overcome; they therefore form a point in
the next chapter. Due to these limitations, which could not be tackled in the scope of this
thesis, no hypotheses could be established and reliably proven, such as whether profits can
be realised on the crypto market with arbitrage trading.
8.3 Outlook & Future work
Limitations arise from general circumstances such as money or time, or from tasks that are
outside the scope of this thesis but would benefit the arbitrage prototype. As only an
arbitrage prototype was developed in the scope of this thesis, some additional methods are
needed to build a fully autonomous arbitrage trading system. The following seven areas are
proposed as future work to strengthen and extend the findings of this thesis.
To begin with the limitations, first, the arbitrage measures of the second method of this thesis
can be re-tested based on data of higher quality, i.e. from paid sources. Since the
calculations of the arbitrage index and the price differences are already in place, it is easy to
run them again with new data in the same file structure and in OHLCV format.
Second, substituting trade simulation with real trade execution on the crypto market
represents the next element of future work. This can be initiated with starting capital to buy
and hold assets on different exchanges to sell. Based on this, further improvements and
benchmark tests can be conducted in order to achieve high performance of the arbitrage
software, given the interplay between the processing power of the executing computer or
server and the latency of the exchanges.
Third, in order to run an arbitrage system fully autonomously, a balancing strategy between
the exchanges is needed to redistribute assets. Since purchases are made on exchanges
with low prices, assets consequently accumulate there one-sidedly. They must be distributed
to other exchanges with currently higher prices. Otherwise, arbitrage possibilities expire
unused because it is not possible to sell on the more expensive exchange. This step is
therefore particularly important for developing an autonomous arbitrage system, since poorly
distributed assets mean that arbitrage opportunities can only be exploited to a limited extent.
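One conceivable starting point for such a balancing strategy is sketched below. It simply
levels the holdings of one asset to an equal share per exchange and ignores transfer fees,
transfer times and current prices, so it is a deliberately naive assumption and not a strategy
developed in this thesis. The exchange labels and amounts are placeholders.

```python
from typing import Dict, List, Tuple


def rebalance_transfers(holdings: Dict[str, float]) -> List[Tuple[str, str, float]]:
    """Return (from_exchange, to_exchange, amount) transfers that level all holdings."""
    target = sum(holdings.values()) / len(holdings)
    surplus = [[ex, amt - target] for ex, amt in holdings.items() if amt > target]
    deficit = [[ex, target - amt] for ex, amt in holdings.items() if amt < target]
    transfers: List[Tuple[str, str, float]] = []
    i = j = 0
    # Greedily move surplus to exchanges that hold less than the target amount.
    while i < len(surplus) and j < len(deficit):
        amount = min(surplus[i][1], deficit[j][1])
        transfers.append((surplus[i][0], deficit[j][0], amount))
        surplus[i][1] -= amount
        deficit[j][1] -= amount
        if surplus[i][1] <= 1e-12:
            i += 1
        if deficit[j][1] <= 1e-12:
            j += 1
    return transfers


# Placeholder holdings of one asset after repeated purchases on exchange "a".
plan = rebalance_transfers({"a": 9.0, "b": 1.0, "c": 2.0})
```

A practical strategy would additionally have to weigh the transfer costs and delays against
the expected price differences on the receiving exchanges.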
Fourth, a method for determining the optimal trading amount per crypto asset can be added.
In the context of this thesis, the basic functions for an arbitrage prototype were developed, so
currently crypto assets are only traded as whole units. However, since several units or only a
fraction of one can be bought at once, this should be taken into account depending on the
available money or assets per exchange.
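A first approximation of such an amount could be the smaller of what can be afforded on the
buying exchange and what is held on the selling exchange. The following sketch shows this
idea under that assumption; the function name, the optional fee parameter and the numbers
are illustrative only.

```python
def tradable_amount(eur_on_buy_exchange: float,
                    asset_on_sell_exchange: float,
                    buy_price: float,
                    buy_fee: float = 0.0) -> float:
    """Largest amount (fractions allowed) that can be bought and sold at once."""
    if buy_price <= 0:
        raise ValueError("buy_price must be positive")
    # Amount affordable with the available EUR, including the assumed buy fee.
    affordable = eur_on_buy_exchange / (buy_price * (1.0 + buy_fee))
    # The sale is limited by the assets already held on the selling exchange.
    return max(0.0, min(affordable, asset_on_sell_exchange))


# Made-up balances: 1000 EUR on the buying exchange, asset price 4000 EUR.
limited_by_cash = tradable_amount(1000.0, 0.5, 4000.0)
limited_by_assets = tradable_amount(1000.0, 0.1, 4000.0)
```

In the first case the available money limits the trade to a quarter of a unit, in the second
case the holdings on the selling exchange limit it to a tenth of a unit.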
Fifth, the incoming price streams of the websockets are currently only passed through a
simple dictionary in Python. This works for the test example mentioned in this thesis with
BTC/EUR and ETH/EUR on Binance and Kraken. However, as the number of assets and
exchanges increases, the processing and mapping of data streams should be handled by a
dedicated server, such as the event streaming platform Apache Kafka.
Sixth, to enhance the user experience, a Graphical User Interface (GUI) can be added. This
can be implemented either as a simple local program in Python or as a web application, for
example with Angular. Currently, the prototype is only available as a command line tool that
can be started and stopped, but a GUI could also be used to set custom trade thresholds and
to show a data stream overview, a profit overview, the asset distribution on the exchanges,
live events of the software and more.
Finally, in addition to the two-point arbitrage used in this thesis, through which market
inefficiencies can be exploited across two exchanges, triangular arbitrage can also be tested
in order to exploit inefficiencies within a single exchange.
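The basic condition for triangular arbitrage can be illustrated with a short sketch. It assumes
three conversion rates on one exchange and a uniform fee per trade; the rates and the fee
are invented and the function name is hypothetical.

```python
def triangular_return(rate_ab: float, rate_bc: float, rate_ca: float,
                      fee: float = 0.0) -> float:
    """Units of A obtained after converting 1 unit A -> B -> C -> A.

    Each rate states how many units of the target asset one unit of the
    source asset buys. A result above 1.0 indicates a profitable cycle.
    """
    return rate_ab * rate_bc * rate_ca * (1.0 - fee) ** 3


# Invented rates: consistent pricing would require rate_ca = 1 / (2.0 * 3.0).
without_fees = triangular_return(2.0, 3.0, 0.17)
with_fees = triangular_return(2.0, 3.0, 0.17, fee=0.01)
```

In this example the cycle returns about 1.02 units before fees, but an assumed fee of 1% per
trade reduces the result to below 1, so the apparent inefficiency could not be exploited.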
9 Conclusion
In conclusion, this thesis investigated the feasibility of arbitrage trading in the cryptocurrency
market, as in traditional financial markets. This was examined in a broader spectrum by
ultimately including 48 crypto assets from the top 100 by market cap, traded against EUR on
16 exchanges. Based on those, a method to filter crypto assets using historical data was
developed, which aimed to identify the ones with the highest likelihood of experiencing price
inefficiencies across exchanges. Further, these findings were extended with the
development of an arbitrage trading prototype, which finds arbitrage opportunities using
continuous real-time data from exchanges and simulates trades with market orders via paper
trading to test the viability of the identified openings. This implementation used centralized
crypto exchanges.
In summary, the results show that over the past 16 months, the history of cryptocurrency
exchanges has been characterised by recurring episodes of opening and closing arbitrage
opportunities, as well as a few periods of high arbitrage spreads that lasted for up to twelve
days. Furthermore, it was shown that relative price differences usually amount to a maximum
of around 30%, but in exceptional cases can even reach up to 60%. In addition, arbitrage
opportunities increased significantly from the second half of 2022 onwards for most assets
and remained at a higher level than in the first half of 2022 until April 2023. Most of the time,
prices equalize again within a day, when arbitrageurs make use of them, but at times it
seems that arbitrage capital gets overwhelmed by speculators or other characteristics of the
crypto market. The results of this thesis therefore indicate that arbitrage opportunities do
exist in this market, even if they should not in theory, as economic equilibrium hypotheses
like the Efficient-Market Hypothesis or the Law of One Price propose. The outcomes are
consistent with other related papers on arbitrage opportunities on the crypto market from
2018 to 2021 (Brauneis and Mestel, 2018; Duan et al., 2021; Makarov and Schoar, 2020).
There are several possible reasons for the occurrence of price differences. In summary,
these include the low level of regulation, the high number of speculators, a continuously
available and easily accessible market and the independent pricing of each asset on every
exchange. Another possible reason for the emergence of arbitrage opportunities for assets
traded against fiat currencies are various capital controls, which supports the point of lower
regulation and explains why arbitrage spreads are smaller in crypto-to-crypto trades, for
example Bitcoin to Ethereum, than in trades against Dollar or Euro (Makarov and Schoar,
2020).
Overall, the results of this thesis suggest that arbitrage trading can be considered a
potentially profitable strategy in the cryptocurrency market and that the methods and the
prototype developed can be valuable tools for traders looking to leverage these
opportunities. However, it is important to note that the scope of this thesis was limited to
trade simulation and did not include executing actual trades in the real market. Further
research and development would be required to evaluate the performance of the proposed
strategy in practice and to implement a fully autonomous arbitrage trading system.
References
[1] Al-Yahyaee, K.H., Mensi, W., Ko, H.-U., Yoon, S.-M., Kang, S.H., 2020. Why
cryptocurrency markets are inefficient: The impact of liquidity and volatility. The North
American Journal of Economics and Finance 52, 101168.
https://doi.org/10.1016/j.najef.2020.101168
[2] Angerer, M., Neugebauer, T., Shachat, J., 2023. Arbitrage bots in experimental asset
markets. Journal of Economic Behavior & Organization 206, 262-278.
https://doi.org/10.1016/j.jebo.2022.12.004
[3] Bachelier, L., 1900. Theory of speculation. Scientific Annals of the École normale
supérieure 17, 21-86.
[4] Barbon, A., Ranaldo, A., 2021. On The Quality Of Cryptocurrency Markets:
Centralized Versus Decentralized Exchanges.
https://doi.org/10.48550/ARXIV.2112.07386
[5] Berg, J.A., Fritsch, R., Heimbach, L., Wattenhofer, R., 2022. An Empirical Study of
Market Inefficiencies in Uniswap and SushiSwap.
https://doi.org/10.48550/ARXIV.2203.07774
[6] Bitfinex | Our Fees [WWW Document], 2023. URL https://www.bitfinex.com/fees/
(accessed 3.30.23).
[7] Black, F., 1971. Toward a Fully Automated Stock Exchange, Part I. Financial
Analysts Journal 27, 28-35. https://doi.org/10.2469/faj.v27.n4.28
[8] Böhme, R., Christin, N., Edelman, B., Moore, T., 2015. Bitcoin: Economics,
Technology, and Governance. Journal of Economic Perspectives 29, 213-238.
https://doi.org/10.1257/jep.29.2.213
[9] Bouoiyour, J., Selmi, R., 2015. What Does Bitcoin Look Like? Annals of Economics
and Finance 16, 449-492.
[10] Bouoiyour, J., Selmi, R., Tiwari, A., 2014. Is Bitcoin business income or speculative
bubble? Unconditional vs. conditional frequency domain analysis. MPRA Paper
59595.
[11] Brauneis, A., Mestel, R., 2018. Price discovery of cryptocurrencies: Bitcoin and
beyond. Economics Letters 165, 58-61. https://doi.org/10.1016/j.econlet.2018.02.001
[12] Brogaard, J., Garriott, C., 2019. High-Frequency Trading Competition. J. Financ.
Quant. Anal. 54, 1469-1497. https://doi.org/10.1017/S0022109018001175
[13] Brogaard, J., Hendershott, T., Riordan, R., 2014. High-Frequency Trading and Price
Discovery. Rev. Financ. Stud. 27, 2267-2306. https://doi.org/10.1093/rfs/hhu032
[14] Bruzgė, R., Šapkauskienė, A., 2022. Network analysis on Bitcoin arbitrage
opportunities. The North American Journal of Economics and Finance 59, 101562.
https://doi.org/10.1016/j.najef.2021.101562
[15] Buchholz, M., Delaney, J., Warren, J., Parker, J., 2012. Bits and Bets, Information,
Price Volatility, and Demand for Bitcoin. Economics 312.
[16] Budish, E., Cramton, P., Shim, J., 2015. The High-Frequency Trading Arms Race:
Frequent Batch Auctions as a Market Design Response*. The Quarterly Journal of
Economics 130, 1547-1621. https://doi.org/10.1093/qje/qjv027
[17] Carhart, M.M., 1997. On Persistence in Mutual Fund Performance. The Journal of
Finance 52, 57-82. https://doi.org/10.1111/j.1540-6261.1997.tb03808.x
[18] Carrasco Blázquez, M., De la Orden De la Cruz, C., Prado Román, C., 2018. Pairs
trading techniques: An empirical contrast. European Research on Management and
Business Economics 24, 160-167. https://doi.org/10.1016/j.iedeen.2018.05.002
[19] Carrion, A., 2013. Very fast money: High-frequency trading on the NASDAQ. Journal
of Financial Markets 16, 680-711. https://doi.org/10.1016/j.finmar.2013.06.005
[20] CCXT - CryptoCurrency eXchange Trading Library, 2023.
[21] CCXT - Documentation [WWW Document], 2023. URL
https://docs.ccxt.com/#/?id=rate-limit (accessed 4.29.23).
[22] Chordia, T., Roll, R., Subrahmanyam, A., 2008. Liquidity and market efficiency.
Journal of Financial Economics 87, 249-268.
https://doi.org/10.1016/j.jfineco.2007.03.005
[23] Ciaian, P., Rajcaniova, M., Kancs, d’Artis, 2016. The economics of BitCoin price
formation. Applied Economics 48, 1799-1815.
https://doi.org/10.1080/00036846.2015.1109038
[24] Clements, R., 2021. Built to Fail: The Inherent Fragility of Algorithmic Stablecoins.
SSRN Journal. https://doi.org/10.2139/ssrn.3952045
[25] Coinbase pricing and fees disclosures [WWW Document], 2023. Coinbase Help.
URL https://help.coinbase.com/en/coinbase/trading-and-funding/pricing-and-fees/fees
(accessed 3.30.23).
[26] CoinGecko API Pricing Plans [WWW Document], 2023. CoinGecko. URL
https://www.coingecko.com/en/api/pricing (accessed 4.29.23).
[27] CoinMarketCap, 2023. CoinMarketCap API Pricing [WWW Document].
coinmarketcap.com. URL https://coinmarketcap.com/api/pricing/ (accessed 4.29.23).
[28] Do, B., Faff, R., Hamza, K., 2006. A New Approach to Modeling and Estimation for
Pairs Trading. Proceedings of 2006 Financial Management Association European
Conference.
[29] Duan, K., Li, Z., Urquhart, A., Ye, J., 2021. Dynamic efficiency and arbitrage potential
in Bitcoin: A long-memory approach. International Review of Financial Analysis 75,
101725. https://doi.org/10.1016/j.irfa.2021.101725
[30] Dwyer, G.P., 2015. The economics of Bitcoin and similar private digital currencies.
Journal of Financial Stability 17, 81–91. https://doi.org/10.1016/j.jfs.2014.11.006
[31] Egorova, K., 2018. Crypto Exchanges, Explained [WWW Document]. Cointelegraph.
URL https://cointelegraph.com/explained/crypto-exchanges-explained (accessed
3.31.23).
[32] Fama, E.F., 1970. Efficient Capital Markets: A Review of Theory and Empirical Work.
The Journal of Finance 25, 383. https://doi.org/10.2307/2325486
[33] Fee Rate [WWW Document], 2023. Binance. URL https://www.binance.com
(accessed 3.30.23).
[34] Fee Structures | Explore our trading fees | Kraken [WWW Document], 2023. URL
https://www.kraken.com/features/fee-schedule (accessed 3.30.23).
[35] Fernández-Pérez, A., Fernández-Rodríguez, F., Sosvilla-Rivero, S., 2012. Genetic
Algorithm for Arbitrage with More than Three Currencies. TI 03, 181–186.
https://doi.org/10.4236/ti.2012.33025
[36] Fischer, T., Krauss, C., Deinert, A., 2019. Statistical Arbitrage in Cryptocurrency
Markets. JRFM 12, 31. https://doi.org/10.3390/jrfm12010031
[37] Fontana, C., 2015. Weak and strong no-arbitrage conditions for continuous financial
markets. Int. J. Theor. Appl. Finan. 18, 1550005.
https://doi.org/10.1142/S0219024915500053
[38] Foucault, T., Kadan, O., Kandel, E., 2005. Limit Order Book as a Market for Liquidity.
The Review of Financial Studies 18, 1171–1217.
[39] Fu, S., Wang, Q., Yu, J., Chen, S., 2022. FTX Collapse: A Ponzi Story.
https://doi.org/10.48550/ARXIV.2212.09436
[40] Glossary of Trading Terms [WWW Document], 2023. CryptoCompare. URL
https://www.cryptocompare.com/coins/guides/glossary-of-trading-terms/ (accessed
4.30.23).
[41] Goldenberg, T., 2018. Watch Out Crypto Exchanges, Decentralization Is Coming
[WWW Document]. URL
https://www.coindesk.com/markets/2018/05/31/watch-out-crypto-exchanges-decentralization-is-coming/
(accessed 3.27.23).
[42] Gould, M.D., Porter, M.A., Williams, S., McDonald, M., Fenn, D.J., Howison, S.D.,
2013. Limit order books. Quantitative Finance 13, 1709–1742.
https://doi.org/10.1080/14697688.2013.803148
[43] Gromb, D., Vayanos, D., 2018. The Dynamics of Financially Constrained Arbitrage.
The Journal of Finance 73, 1713–1750. https://doi.org/10.1111/jofi.12689
[44] Gromb, D., Vayanos, D., 2002. Equilibrium and welfare in markets with financially
constrained arbitrageurs. Journal of Financial Economics 66, 361–407.
https://doi.org/10.1016/S0304-405X(02)00228-3
[45] Hautsch, N., Scheuch, C., Voigt, S., 2018. Limits to Arbitrage in Markets With
Stochastic Settlement Latency. SSRN Journal. https://doi.org/10.2139/ssrn.3302159
[46] Heckel, M., Waldenberger, F. (Eds.), 2022. The Future of Financial Systems in the
Digital Age: Perspectives from Europe and Japan, Perspectives in Law, Business and
Innovation. Springer Singapore, Singapore. https://doi.org/10.1007/978-981-16-7830-1
[47] Hevner, A., Chatterjee, S., 2010. Design Science Research in Information Systems,
in: Design Research in Information Systems, Integrated Series in Information
Systems. Springer US, Boston, MA, pp. 9–22. https://doi.org/10.1007/978-1-4419-5653-8_2
[48] Hevner, A.R., March, S.T., Park, J., Ram, S., 2004. Design Science in Information Systems Research.
MIS Quarterly 28, 75. https://doi.org/10.2307/25148625
[49] Holste, B., Gallus, C., 2019. Sind Krypto-Währungsmärkte Fair? (Are
Crypto-Currency Markets Fair?). SSRN Journal. https://doi.org/10.2139/ssrn.3466919
[50] Isard, P., 1976. How Far Can We Push The “Law of One Price”? Int. finance discuss.
pap. 1976, 1–22. https://doi.org/10.17016/IFDP.1976.84
[51] Jensen, M.C., 2002. Some Anomalous Evidence Regarding Market Efficiency. SSRN
Journal. https://doi.org/10.2139/ssrn.244159
[52] Jofre, A., Rockafellar, R.T., Wets, R.J.-B., 2014. General Economic Equilibrium with
Financial Markets and Retainability. SSRN Journal.
https://doi.org/10.2139/ssrn.2460128
[53] Johnstone, S., 2019. Requisites for Development of a Regulated Secondary Market
in Digital Assets. SSRN Journal. https://doi.org/10.2139/ssrn.3379623
[54] Kabašinskas, A., Šutienė, K., 2021. Key Roles of Crypto-Exchanges in Generating
Arbitrage Opportunities. Entropy 23, 455. https://doi.org/10.3390/e23040455
[55] Kakushadze, Z., Yu, W., 2019. Altcoin-Bitcoin Arbitrage. SSRN Journal.
https://doi.org/10.2139/ssrn.3327524
[56] Keim, D.B., Madhavan, A., 1997. Transactions costs and investment style: an
inter-exchange analysis of institutional equity trades. Journal of Financial Economics 46,
265–292. https://doi.org/10.1016/S0304-405X(97)00031-7
[57] Kiuchi, T., 2022. High-Frequency Trading in Japan: A Unique Evolution, in: Heckel,
M., Waldenberger, F. (Eds.), The Future of Financial Systems in the Digital Age,
Perspectives in Law, Business and Innovation. Springer Singapore, Singapore, pp.
159–183. https://doi.org/10.1007/978-981-16-7830-1_9
[58] Kristoufek, L., 2013. BitCoin meets Google Trends and Wikipedia: Quantifying the
relationship between phenomena of the Internet era. Sci Rep 3, 3415.
https://doi.org/10.1038/srep03415
[59] Krückeberg, S., Scholz, P., 2020. Decentralized Efficiency? Arbitrage in Bitcoin
Markets. Financial Analysts Journal 76, 135–152.
https://doi.org/10.1080/0015198X.2020.1733902
[60] Kühl, M., 2010. Bivariate cointegration of major exchange rates, cross-market
efficiency and the introduction of the Euro. Journal of Economics and Business 62,
1–19. https://doi.org/10.1016/j.jeconbus.2009.07.002
[61] Kyle, A.S., 1985. Continuous Auctions and Insider Trading. Econometrica 53, 1315.
https://doi.org/10.2307/1913210
[62] Lee, S., Meslmani, N.E., Switzer, L.N., 2020. Pricing Efficiency and Arbitrage in the
Bitcoin Spot and Futures Markets. Research in International Business and Finance
53, 101200. https://doi.org/10.1016/j.ribaf.2020.101200
[63] Levus, R., Berko, A., Chyrun, L., Panasyuk, V., Hrubel, M., 2021. Intelligent System
for Arbitrage Situations Searching in the Cryptocurrency Market, in: MoMLeT+DS.
[64] Liu, G., Yu, C.-P., Shiu, S.-N., Shih, I.-T., 2022. The Efficient Market Hypothesis and
the Fractal Market Hypothesis: Interfluves, Fusions, and Evolutions. SAGE Open 12,
215824402210821. https://doi.org/10.1177/21582440221082137
[65] Makarov, I., Schoar, A., 2020. Trading and arbitrage in cryptocurrency markets.
Journal of Financial Economics 135, 293–319.
https://doi.org/10.1016/j.jfineco.2019.07.001
[66] Malinova, K., Park, A., 2011. Subsidizing Liquidity: The Impact of Make/Take Fees on
Market Quality. SSRN Journal. https://doi.org/10.2139/ssrn.1944054
[67] Mohan, V., 2022. Automated market makers and decentralized exchanges: a DeFi
primer. Financ Innov 8, 20. https://doi.org/10.1186/s40854-021-00314-5
[68] Nakamoto, S., 2008. Bitcoin: A Peer-to-Peer Electronic Cash System.
[69] O’Hara, M., 2015. High frequency market microstructure. Journal of Financial
Economics 116, 257–270. https://doi.org/10.1016/j.jfineco.2015.01.003
[70] Ozenbas, D., Pagano, M.S., Schwartz, R.A., Weber, B.W., 2022. Liquidity, Markets
and Trading in Action: An Interdisciplinary Perspective, Classroom Companion:
Business. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-030-74817-3
[71] Parlour, C.A., Seppi, D.J., 2008. Limit Order Markets: A Survey, in: Handbook of
Financial Intermediation and Banking. Elsevier, pp. 63–96.
https://doi.org/10.1016/B978-044451558-2.50007-6
[72] Pauna, C., 2018. Arbitrage Trading Systems for Cryptocurrencies. Design Principles
and Server Architecture. IE 22, 35–42.
https://doi.org/10.12948/issn14531305/22.2.2018.04
[73] Poitras, G., 2010. Arbitrage: Historical Perspectives, in: Cont, R. (Ed.), Encyclopedia
of Quantitative Finance. John Wiley & Sons, Ltd, Chichester, UK, p. eqf01010.
https://doi.org/10.1002/9780470061602.eqf01010
[74] Pourpounehnajafabadi, M., Nielsen, K., Ross, O., 2020. Automated Market Makers.
Department of Food and Resource Economics, University of Copenhagen IFRO
Working Paper.
[75] Pricing | CryptoCompare API [WWW Document], 2023. CryptoCompare. URL
https://min-api.cryptocompare.com/pricing (accessed 4.29.23).
[76] Ross, S.A., 1976. The arbitrage theory of capital asset pricing. Journal of Economic
Theory 13, 341–360. https://doi.org/10.1016/0022-0531(76)90046-6
[77] Saengchote, K., 2021. A DeFi Bank Run: Iron Finance, IRON Stablecoin, and the Fall
of TITAN. SSRN Journal. https://doi.org/10.2139/ssrn.3888089
[78] Schär, F., 2020. Decentralized Finance: On Blockchain- and Smart Contract-based
Financial Markets. SSRN Journal. https://doi.org/10.2139/ssrn.3571335
[79] Total Cryptocurrency Market Cap [WWW Document], 2023. CoinMarketCap. URL
https://coinmarketcap.com/charts/ (accessed 4.25.23).
[80] Urquhart, A., 2016. The inefficiency of Bitcoin. Economics Letters 148, 80–82.
https://doi.org/10.1016/j.econlet.2016.09.019
[81] Websocket API | Binance Developers [WWW Document], 2023. URL
https://developers.binance.com/docs/binance-trading-api/websocket_api (accessed
2.4.23).
[82] Wei, W.C., 2018. Liquidity and market efficiency in cryptocurrencies. Economics
Letters 168, 21–24. https://doi.org/10.1016/j.econlet.2018.04.003
[83] Weron, A., Weron, R., 2000. Fractal market hypothesis and two power-laws. Chaos,
Solitons & Fractals 11, 289–296. https://doi.org/10.1016/S0960-0779(98)00295-1
[84] Which API should I use? REST versus WebSocket [WWW Document], 2023. Kraken. URL
https://support.kraken.com/hc/en-us/articles/4404197772052-Which-API-should-I-use-REST-versus-WebSocket
(accessed 2.4.23).
[85] Zhang, W., Wang, P., Li, X., Shen, D., 2018. The inefficiency of cryptocurrency and
its cross-correlation with Dow Jones Industrial Average. Physica A: Statistical
Mechanics and its Applications 510, 658–670.
https://doi.org/10.1016/j.physa.2018.07.032
List of Figures
Figure 1: Visualization of a two-point arbitrage trading process
Figure 2: Visualization of a triangular arbitrage trading process
Figure 3: All observed price fluctuations occur due to shifts in demand (Buchholz et al., 2012).
Figure 4: Schematic functionality of a Limit Order Book System (Gould et al., 2013)
Figure 5: System development research model according to Design Science by Hevner. Adopted from Nunamaker (Hevner and Chatterjee, 2010)
Figure 6: Preferred exchanges, resulting from top 25 exchange rankings by platforms trust score.
Figure 7: File structure of gathered historical data.
Figure 8: Process of gathering data from exchanges for cryptocompare.com and CCXT
Figure 9: Architectural approach with three actors and their corresponding data streams
Figure 10: File structure of the arbitrage trading prototype
Figure 11: Data example from Binance for ETH/EUR over the specified timeframe.
Figure 12: Arbitrage Index for all 48 assets over all their available exchanges
Figure 13: Arbitrage Index for the top 10 assets (by market cap) over all their available exchanges
Figure 14: Corresponding table with statistical insights about the Arbitrage Index for the top 10 assets (by market cap)
Figure 15: Arbitrage Index for the 10 assets with the lowest market cap of the considered list, over all their available exchanges
Figure 16: Corresponding table with statistical insights about the Arbitrage Index for the lowest 10 assets of the considered list (by market cap)
Figure 17: Arbitrage Index for the 10 assets with the highest mean, over all their available exchanges
Figure 18: Corresponding table with statistical insights about the Arbitrage Index for the 10 assets with the highest mean
Figure 19: Arbitrage Index for the 10 assets with the lowest mean, over all their available exchanges
Figure 20: Corresponding table with statistical insights about the Arbitrage Index for the 10 assets with the lowest mean
Figure 21: All 48 assets, sorted by the Arbitrage Index mean value of the last 16 months.
Figure 22: Relative Price Differences to the mean price for 1INCH
Figure 23: Price Differences to the mean price table for 1INCH
Figure 24: Relative Price Differences to the mean price for ALGO
Figure 25: Price Differences to the mean price table for ALGO
Figure 26: Relative Price Differences to the mean price for ATOM
Figure 27: Price Differences to the mean price table for ATOM
Figure 28: Relative Price Differences to the mean price for MANA
Figure 29: Price Differences to the mean price table for MANA
Figure 30: Relative Price Differences to the mean price for XTZ
Figure 31: Price Differences to the mean price table for XTZ
Figure 32: Example of the prototype logging to csv, where arbitrage opportunities were found.
List of Tables
Table 1: Research Questions and used methods.
Table 2: Keyword search results
Table 3: Top 100 assets by market cap considered at the beginning with abbreviation and full name.
Table 4: 85 available exchanges from cryptocompare.com and CCXT with abbreviation and full name.
Table 5: Scraped Websites with exchange ranking by Trust Scores, Exchange Scores or Points.
Table 6: 48 available assets of the 100 considered, with abbreviation and full name.
Table 7: Exchanges with false data by occurrences
Table 8: The 48 assets, for which data was available, with abbreviation and full name
Table 9: The 16 remaining exchanges from which data was available, with abbreviation and full name.
List of Abbreviations
API     Application Programming Interface
APT     Arbitrage Pricing Theory
ATS     Alternative Trading Service
BTC     Bitcoin
CEX     Centralised Exchange
DEX     Decentralised Exchange
EMH     Efficient Market Hypothesis
ETH     Ethereum
EUR     Euro
GUI     Graphical User Interface
I/O     Input/Output
ICO     Initial Coin Offering
LOB     Limit Order Book
OHLCV   Open, High, Low, Close, Volume
OTC     Over-the-counter Market
REST    Representational State Transfer
RQ      Research Question
SPOF    Single Point of Failure
TCP     Transmission Control Protocol
VWAP    Volume Weighted Average Price
Attachment A: Data gathering from cryptocompare.com
import csv
import os

import pandas as pd
import requests


def historicalData_cryptocompare(fsym, tsym, limit, start_timestamp, api_url, exchange):
    # NOTE: end_timestamp is not a parameter of this function; it has to be
    # defined at module level (Unix seconds) before the function is called.
    symbol = f"{fsym}_{tsym}"
    folder_name = f"data/historical_{symbol}"
    fieldnames = ['time', 'open', 'high', 'low', 'close', 'volumefrom', 'volumeto']
    # create the folder if it doesn't exist and set the filename
    os.makedirs(folder_name, exist_ok=True)
    filename = f"{folder_name}/{exchange}_{symbol}_h.csv"
    # Create csv and write the header
    with open(filename, mode='w', newline='') as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writeheader()
    to_ts = None
    file_created = True
    while True:
        current_url = f"{api_url}&e={exchange}&fsym={fsym}&tsym={tsym}&limit={limit}"
        # every request after the first one pages further back in time via toTs
        if to_ts is not None:
            current_url += f"&toTs={to_ts}"
        # get the response from the API
        response = requests.get(current_url)
        ohlcv_data = response.json()
        # if the API reports an error, the trading pair is not available on this
        # exchange and gets skipped. Also, the created file is removed
        if ohlcv_data.get("Response") == "Error":
            print(f"Skipping {exchange}: {ohlcv_data.get('Message')}")
            file_created = False
            os.remove(filename)
            break
        data_points = ohlcv_data['Data']['Data']
        earliest_timestamp = ohlcv_data['Data']['TimeFrom']
        # Append the data of every request to the csv
        with open(filename, mode='a', newline='') as csv_file:
            writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
            for data_point in data_points:
                writer.writerow({k: data_point[k] for k in fieldnames})
        # Break the while loop if the beginning of the historical data is reached
        if earliest_timestamp <= start_timestamp:
            break
        to_ts = earliest_timestamp
    if file_created:
        # Clean csv and remove rows outside the specified time range
        df = pd.read_csv(filename)
        df['time'] = df['time'].astype(int)
        df = df[(df['time'] >= start_timestamp) & (df['time'] <= end_timestamp)]
        df = df.sort_values(by='time')
        df = df.drop_duplicates(subset='time', keep='first')
        df.to_csv(filename, index=False)
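The backward paging that Attachment A performs with the toTs parameter can be tried out without network access. The following is a minimal sketch under stated assumptions, not the prototype's code: fetch_page and collect_backwards are hypothetical names, fetch_page merely imitates one API response with the TimeFrom and Data fields used above, and treating toTs as inclusive is an assumption that makes consecutive pages overlap by one candle. The extra "no progress" guard is also an addition for the sketch.

```python
def fetch_page(candles, limit, to_ts=None):
    """Hypothetical stand-in for one API request.

    Returns up to `limit` candles whose time is <= to_ts (inclusive, assumed),
    or the most recent candles if to_ts is None.
    """
    eligible = [c for c in candles if to_ts is None or c["time"] <= to_ts]
    page = eligible[-limit:]
    return {"TimeFrom": page[0]["time"], "Data": page}


def collect_backwards(candles, limit, start_timestamp):
    """Page backwards in time until start_timestamp is covered."""
    rows = []
    to_ts = None
    while True:
        page = fetch_page(candles, limit, to_ts)
        rows.extend(page["Data"])
        earliest = page["TimeFrom"]
        # stop once the requested start is covered, or if paging makes no progress
        if earliest <= start_timestamp or earliest == to_ts:
            break
        to_ts = earliest
    # overlapping pages produce duplicates: keep one row per timestamp,
    # sort ascending and drop rows before the requested start
    unique = {row["time"]: row for row in rows}
    return [unique[t] for t in sorted(unique) if t >= start_timestamp]


# ten hourly candles, fetched four at a time, wanted from hour 2 onwards
candles = [{"time": h * 3600, "close": float(h)} for h in range(10)]
result = collect_backwards(candles, limit=4, start_timestamp=2 * 3600)
print([row["time"] // 3600 for row in result])
```

Three requests are needed in this example, and the candles at hours 3 and 6 are delivered twice, which is why the cleaning step removes duplicates before the data is used.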
Attachment B: Data gathering from CCXT Library
import os
import time

import ccxt
import pandas as pd


def historicalData_ccxt(exchange_id, symbol, timeframe='1h', start_date=None, end_date=None):
    # NOTE: start_timestamp and end_timestamp are not parameters of this function;
    # they have to be defined at module level (Unix seconds) before it is called.
    # get exchange data
    exchange = getattr(ccxt, exchange_id)()
    # check if the exchange offers OHLCV data
    if exchange.has['fetchOHLCV']:
        try:
            ohlcv = []  # stays empty if no date range is given
            if start_date and end_date:
                since = exchange.parse8601(start_date)
                until = exchange.parse8601(end_date)
                # page forward in time; timestamps are in milliseconds
                while since < until:
                    ohlcv_data = exchange.fetch_ohlcv(symbol, timeframe, since)
                    if not ohlcv_data:
                        break
                    # continue right after the last candle received
                    since = ohlcv_data[-1][0] + 1
                    ohlcv += ohlcv_data
                    # respect the rate limit of the exchange (given in milliseconds)
                    time.sleep(exchange.rateLimit / 1000)
            # create the folder if it doesn't exist and set the filename
            folder_name = f"data/historical_{symbol.replace('/', '_')}"
            os.makedirs(folder_name, exist_ok=True)
            filename = f"{folder_name}/{exchange_id}_{symbol.replace('/', '_')}_h.csv"
            # Convert to dataframe, clean csv and remove rows outside the specified time range
            df = pd.DataFrame(ohlcv, columns=["time", "open", "high", "low", "close", "volume"])
            # convert milliseconds to seconds
            df['time'] = df['time'] / 1000
            df['time'] = df['time'].astype(int)
            df = df[(df['time'] >= start_timestamp) & (df['time'] <= end_timestamp)]
            df = df.sort_values(by='time')
            df = df.drop_duplicates(subset='time', keep='first')
            # Write the data to csv
            df.to_csv(filename, index=False)
            print(f"Historical data saved for {symbol} on {exchange_id}")
        except Exception as message:
            print(f"Error for {symbol} on {exchange_id}: {message}")
    else:
        print(f"{exchange_id} does not support fetchOHLCV")
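The forward paging loop of Attachment B can be checked offline in the same way. The sketch below is an illustration under stated assumptions: FakeExchange and collect_forwards are hypothetical names, and FakeExchange only imitates the two members the loop relies on (fetch_ohlcv and rateLimit); it is not part of CCXT. It shows why rows beyond the end of the range have to be filtered afterwards: the last page may overshoot the end timestamp.

```python
import time

HOUR_MS = 3_600_000  # one hour in milliseconds, the unit used for `since`


class FakeExchange:
    """Hypothetical stand-in for a CCXT exchange object (offline, no requests)."""

    rateLimit = 0  # milliseconds between requests; 0 so the sketch runs instantly

    def __init__(self, candles, page_size):
        self.candles = candles
        self.page_size = page_size

    def fetch_ohlcv(self, symbol, timeframe, since):
        # return at most page_size candles starting at `since`
        page = [c for c in self.candles if c[0] >= since]
        return page[: self.page_size]


def collect_forwards(exchange, symbol, timeframe, since, until):
    """Page forward from `since` and keep only candles up to `until`."""
    ohlcv = []
    while since < until:
        page = exchange.fetch_ohlcv(symbol, timeframe, since)
        if not page:
            break
        since = page[-1][0] + 1  # continue right after the last candle received
        ohlcv += page
        time.sleep(exchange.rateLimit / 1000)
    # the last page can reach beyond `until`, so cut it off here
    return [c for c in ohlcv if c[0] <= until]


# ten hourly candles [time, open, high, low, close, volume], served four per request
candles = [[h * HOUR_MS, 1.0, 1.0, 1.0, 1.0, 0.0] for h in range(10)]
exchange = FakeExchange(candles, page_size=4)
rows = collect_forwards(exchange, "ETH/EUR", "1h", since=0, until=5 * HOUR_MS)
print([c[0] // HOUR_MS for c in rows])
```

In this run two requests deliver the candles for hours 0 to 7, and the final filter reduces them to hours 0 to 5.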